Introduction


This chapter is an overview of the instruction set and provides an introduction to the detailed 
instruction set descriptions. The processor architecture is summarized in Chapter 2: 
Architecture Overview on page 13. 


Background concepts 


It is important to note that the instruction set is a macro level instruction set. This has the 
advantage that a very regular instruction set can be provided where, for example, almost 
any types of operands and addressing modes can be used with any instruction. However, 
this does mean that each source macro instruction may result in one or more ‘real’ 
instructions being generated for the target architecture. The instruction set documentation 
indicates which instructions are implemented directly and which are macro instructions. 


Instruction format 


Assembly language syntax is described in detail in the SDK Reference Manual. A
summary of the relevant features is included here.


Basic syntax 


An instruction consists of a mnemonic followed by a number of comma separated operands. 
The instruction is terminated with a semicolon or new line. Multiple instructions can appear 
on one line, separated by semicolons. 


ld 0:m2, x_value; add 0:p2, 0:p2, 0:m2;


Mnemonics 


Instruction mnemonics describe the operation to be performed. Instruction names are
alphabetic, with dots separating the fields of an instruction name. For example, the compare
instructions all have the form cmp.cc, where the condition code, cc, is eq for ‘compare
equal’, gt for ‘compare greater than’, and so on.


Operands 


Operands have the form value:specifiers, where the : and the specifiers are optional
components. The value can be a numeric literal or a label and refers to a register or an
immediate. The specifiers are four fields after the colon: domain, size, type and vector size.
The specifiers must appear in this order.


Domain is either m, p or i for mono, poly or immediate. For some instructions, an 
immediate operand must be a literal constant (or a value that can be evaluated as a 
constant at assembly-time) and not a label. In the instruction set documentation, the 
notation i1 is used for instructions where the operand can be an immediate or a 
label. 


Size is normally 1 to 8 (bytes). 
Type is u, s or f for unsigned, signed or float respectively. 
Vector size is used in those specific instructions which take vector arguments. These
indicate a contiguous group of vector size elements, each of width size, in the mono 
or poly register file. Square brackets are used to indicate the vector size. 


Note that for some operations involving data movement the size field can be up to 64 bytes. 


The operand structure is shown in Figure 1. This example indicates two double precision
floating-point (8-byte) values in the poly register file starting at location 10.


[Figure 1 placeholder: the operand 10:p8f[2] annotated with its fields: value (register address), domain (poly), size (8 bytes), type (floating point) and the optional vector size.]


Figure 1. An illustration of the fields for an instruction operand


The value field may be a numeric literal or a label. The value field can be interpreted as 
either a register or an immediate, depending on the domain. 


If the domain is mono or poly, the value field is considered to be a register address. 
This specifies a byte index into the register file of the appropriate domain. For example, if 
there are 32 16-bit mono registers, the address will be in the range 0-63. 
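

The operand format described above can be illustrated with a small parser. This is only a sketch in Python and is not part of the assembler: the names Operand and parse_operand and the regular expression are assumptions made for illustration, every specifier is treated as optional, and no check is made that the combination is legal for a given instruction.

```python
import re
from typing import NamedTuple, Optional


class Operand(NamedTuple):
    value: str                  # numeric literal or label, kept as text
    domain: Optional[str]       # 'm' (mono), 'p' (poly) or 'i' (immediate)
    size: Optional[int]         # width in bytes
    kind: Optional[str]         # 'u', 's' or 'f'
    vector_size: Optional[int]  # element count given in square brackets


# value, then optionally ':' followed by domain, size, type and [vector size],
# in that order. Treating each specifier as optional is an assumption.
_OPERAND = re.compile(
    r"^(?P<value>[^:\s]+)"
    r"(?::(?P<domain>[mpi])?(?P<size>\d+)?(?P<kind>[usf])?"
    r"(?:\[(?P<vec>\d+)\])?)?$"
)


def parse_operand(text: str) -> Operand:
    """Split an operand such as '10:p8f[2]' into its fields."""
    m = _OPERAND.match(text.strip())
    if m is None:
        raise ValueError(f"not a recognised operand: {text!r}")
    size = int(m["size"]) if m["size"] else None
    vec = int(m["vec"]) if m["vec"] else None
    return Operand(m["value"], m["domain"], size, m["kind"], vec)
```

For instance, parse_operand("10:p8f[2]") yields value "10", domain "p", size 8, type "f" and vector size 2, which matches the operand described for Figure 1.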


Literals 


Numeric values can be expressed in decimal, octal, hexadecimal or floating point notation. 
Octal and hexadecimal numbers are represented with the C syntax, for example, 0177 
(octal) and 0x1234 (hex). The type (integer or floating point) can be deduced from the 
syntax of the number. Double precision (64 bit) floating-point values are indicated by
appending :8; for example, 3.5 is assumed to be a 32-bit value, whereas 3.5:8 is a 64-bit float.


Character literals can be used as integer values; they are enclosed in single quotes. 


Instruction domain 


Perhaps the most important feature of the assembly language is that the instructions rely on
the operands to determine the domain and type of the operation. This means, for instance,
that there is a single multiply instruction, mul, which is used for integer and floating-point
multiplication of different sizes in the mono and poly domains. The operands themselves
dictate which multiply instructions should be generated. Table 1 shows an example of how
this works for a selection of operands.


Operation   Operand 1       Operand 2    Operand 3    Meaning
            (destination)   (source 1)   (source 2)

mul         0:m4s           4:m4s        8:m4s        4 byte mono signed multiply
            0:m2u           2:m2u        2:m2u        2 byte mono unsigned multiply
            0:p2s           2:p1s        3:p1s        1 byte poly signed multiply with 2 byte result
            0:m4f           4:m4f        12:m4f       4 byte floating point multiply


Table 1. Example operand types


Instructions typically have 0 to 3 operands. A few special instructions have more, for
example, call.


As a general rule, if any of the operands is poly, the instruction is executed on the poly
execution unit; otherwise on the mono execution unit. For instructions with no operands the
description should make it clear; so, for example, endif is executed on the poly execution
unit and all branches are executed on the mono execution unit.


1.3 Instruction types


There are three types of instructions implemented in the processor, although the difference 
between these is rarely visible to the programmer. 


Hardware instructions are implemented directly by the processor. These are generally 
mono instructions. 


Microcode instructions are implemented on the PE array as a sequence of micro- 
instruction steps. The use of microcode allows the multiple function units in the PE to be 
driven in parallel (rather like a VLIW processor) and allows the instruction set to be 
optimized for specific applications. 


Macro instructions consist of a sequence of other instructions to define a higher-level 
operation. A set of useful instructions are defined as macros to make the instruction set 
more ‘regular’. Some macros require the use of temporary registers. This is described in 
more detail below. 


The predefined macro instructions can be used by including the appropriate header file, or
they can all be included at once by using the header file macro_includes.inc.


1.4 Instruction constraints


Although there is great flexibility in combining operand domains, widths and types there are 
some restrictions on what is allowed. Some of these constraints apply across the whole 
instruction set, others are per-instruction restrictions. Also, note that some of the general 
restrictions are relaxed on individual instructions. For example, cast allows a one-byte 
mono destination, whereas most instructions do not (see Section 1.9.1: Instruction 
constraints on page 10). 


The general constraints are described in the following sections. Note that individual
instructions can override any of these constraints.


1.4.1 Alignment


The instruction set requires that all register operands are naturally aligned: all register
accesses must be aligned to a multiple of the register size. For example, 4:p4 and 8:m8 are
allowed, but 6:p4 is not.


1.4.2 Mixing operand domains


Unless specifically allowed by the instruction, only the last operand can be of a different
domain from the others. For example, add 0:p2, 0:p2, 0:m2. If the preceding operands are
mono, the last operand may only be mono or immediate.


1.4.3 Mixture of operand types


Types (unsigned, signed and float) of sources should be the same. The destination is also
usually of the same type.


1.4.4 Widths


Widths must be nonzero. Operands must be of width 1, 2 or 4; mono operand widths must
be at least 2. Floating point operands must have a width of 4 or 8. Immediate values are
assumed to be of width 4, unless the width is specified. In general, all source operands must
have the same width, unless otherwise specified. The destination width is based on the
source widths (it is equal to the source width if not specified).


1.4.5 Vector operations


A small set of instructions operate on vectors: a vector is a contiguous set of registers
holding related values. The purpose of vector operations is to make better use of internal
pipelines and allow greater throughput of operations.


1.5 Status register updates


Instructions may either update the status register (mono or poly), leave it undefined or leave
it unchanged. The meanings of the mono and poly status flags are identical (although their
respective bits are not guaranteed to be in the same place).


For each operation, a calculated result, represented by res, is used to describe the effect on
the status flags. res is assumed to be at least 1 bit larger than the actual destination so it can
store the result of any overflow or underflow. In the calculation, any sources are expected to
be extended (sign extended, if signed) or truncated to be the same width as the destination.
For example, for add dst:p2, src0:p2, src1:p1s, the calculated result, res, would be
17 bits wide and src1 is sign extended to 2 bytes. Table 2 shows how res is calculated for
each of the operations.


Operation                 res

add  dst, src0, src1      src0 + src1
addc dst, src0, src1      src0 + src1 + carry
sub  dst, src0, src1      src0 + (not src1) + 1
subc dst, src0, src1      src0 + (not src1) + carry
neg  dst, src             (not src) + 1
negc dst, src             (not src) + carry
and  dst, src0, src1      src0 and src1
or   dst, src0, src1      src0 or src1
not  dst, src0, src1      not src1
xor  dst, src0, src1      src0 xor src1


Table 2. Calculation of res for operations


The status flags are calculated from res and dst for these operations as shown in Table 3. 
In these descriptions, md is used to refer to the bit number of the most significant bit of the 
destination operand. 


Flag        Calculation

Carry       Bit md+1 of res
Zero        dst == 0
MSB         Bit md of res
Overflow    carry xor carry2, where carry2 is the value that MSB would have
            if bit md was zero on both source operands
Negative    MSB xor overflow


Table 3. Calculation of status flags


Note: For the operations, addc, subc and negc the status flags depend on the previous 
instruction as well as this instruction. The zero flag will be set by addc, subc and negc if it 
was already set to zero and the result of the operation is zero. Otherwise the zero flag will 
be unset. 
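

The flag rules in Table 3 can be worked through for a simple add. The following Python sketch is illustrative only: the function name add_status_flags is hypothetical, both sources are assumed to have already been extended or truncated to the destination width, and the sticky zero-flag behavior of addc, subc and negc is not modeled.

```python
from typing import Dict


def add_status_flags(src0: int, src1: int, width_bytes: int) -> Dict[str, int]:
    """Model res and the status flags for 'add dst, src0, src1'."""
    bits = 8 * width_bytes
    md = bits - 1                       # bit number of the destination MSB
    mask = (1 << bits) - 1
    a, b = src0 & mask, src1 & mask     # sources at destination width
    res = a + b                         # one bit wider than the destination
    dst = res & mask

    carry = (res >> (md + 1)) & 1       # bit md+1 of res
    msb = (res >> md) & 1               # bit md of res
    low = (1 << md) - 1
    # carry2: the value bit md would have if bit md of both sources were zero,
    # that is, the carry out of the lower bits into bit md.
    carry2 = (((a & low) + (b & low)) >> md) & 1
    overflow = carry ^ carry2
    negative = msb ^ overflow
    zero = int(dst == 0)
    return {"carry": carry, "zero": zero, "msb": msb,
            "overflow": overflow, "negative": negative}
```

With 2-byte operands, adding 0x7FFF and 0x0001 sets MSB and overflow but leaves negative clear, because the true (unwrapped) result is positive; adding 0xFFFF and 0x0001 sets carry and zero.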


1.6 Poly instructions 


In general, instructions are poly if they have poly operands, and mono otherwise. Where this 
is not the case it is explicitly noted. If an instruction is poly then it will only execute if the PE 
is currently enabled, although some instructions can override this (these are known as 
forced instructions). 
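

The general rule for choosing the execution unit can be sketched in a few lines of Python. The helper names operand_domain and execution_unit are hypothetical, and the sketch covers only the general rule: exceptions that the instruction descriptions note explicitly (such as instructions with no operands) are not modeled.

```python
import re
from typing import Iterable


def operand_domain(operand: str) -> str:
    """Return 'm', 'p' or 'i' for an operand such as '0:p2', or '' if no domain is given."""
    m = re.match(r"^[^:]+:([mpi])", operand.strip())
    return m.group(1) if m else ""


def execution_unit(operands: Iterable[str]) -> str:
    """General rule: poly if any operand is poly, mono otherwise."""
    if any(operand_domain(op) == "p" for op in operands):
        return "poly"
    return "mono"
```

For example, the operands of add 0:p2, 0:p2, 0:m2 select the poly unit, while those of mul 0:m4s, 4:m4s, 8:m4s select the mono unit.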


1.7 Temporary registers


Macro instructions may require temporary mono or poly registers. These are reserved using 
the .setmonotemp and .setpolytemp assembler directives. The instruction set 
documentation describes the number of temporary registers needed for each instruction. By 
convention, these registers are allocated from the top of the register file. 


1.8 Cycle counts


The instruction set document includes cycle counts for each variant of the instruction. Some
of these figures may be dependent on external factors such as external memory latency.


1.9 Format of instruction documentation


Each instruction mnemonic is described as a separate entry. Some instructions are split into 
several entries depending on their mix of operand domains or types. For instance, integer 
add and floating point add are described separately. The instructions are described using 
the fields in Table 4. 


Field          Description                                           Example

Name           Name of instruction (may have further                 lsl (unsigned)
               differentiation, for example, floating point,
               in brackets).

Operands       Shows the operands for the instruction. Note, the     dst, src0, src1
               names used are purely to aid understanding. This
               will be blank if the instruction has no operands.

Description    Short description of the function of the              dst = src0 << src1 (unsigned left shift)
               instruction. Typically in pseudo code.

Constraints    Specific constraints for this instruction.            dst must have type unsigned
                                                                     dst must have equal width to src0
                                                                     src0 must have type unsigned
                                                                     src1 must have type unsigned
                                                                     All operands must have type integer

Side Effects   Any other effects this instruction has, for           Leaves the status register in an
               example, changing the status register.                undefined state

Notes          Any extra information which may be useful; for        0 is shifted into the least significant
               example, a more detailed description of what the      bit of dst as it is shifted left.
               instruction does.

Details        A table with more information for each valid combination of operands. This includes
               the number of cycles to execute the instruction, how many temporary registers are
               required and the instruction type. The instruction type will be one of:
                 hardware: implemented directly in hardware
                 microcode: implemented as a series of microcode steps
                 macro: implemented as an assembler macro


Table 4. Instruction description format


1.9.1 Instruction constraints


Each instruction can have specific constraints. Some of these constraints are specific to
individual operands or a relationship between operands. Other constraints are global to the
whole instruction. Also note that the constraints section will be used to show where general,
instruction set-wide constraints are relaxed. Constraints usually fall into one of the following
categories:

• A restriction on types allowed, such as float.

• A restriction on domains or combination of domains.

• A restriction on widths. Often, there is a restriction on the destination operand’s width
  based on the widths of the other operands.

• Alignment (usually a relaxation of constraints).

• Overlapping of instruction operands. Some instructions allow a destination operand
  and a source operand to be overlapped, others do not.

• Poly instructions may require some minimum number of enable bits to be available. For
  example, a floating point mul instruction requires 3 bits free.


1.9.2 Side effects 


Side effects are any state changed by the instruction, other than the destination operand. 


The side effects are related to the status register or enable state. Either the status register is 
left in an undefined state or the status register is updated (described in Section 1.5: Status 
register updates on page 8). The enable state is affected by, for example, comparison 
instructions. 


2 Architecture Overview


This chapter provides a brief overview of the multithreaded array processor (MTAP)
architecture used in the CSX processor family, focusing on those aspects most relevant to
programmers, particularly when coding in a high-level language.


2.1 Overview 


Figure 2 shows a high-level view of the multithreaded array processor core. Most 
components within the core will be familiar to users of other processors. Many of the 
software concepts are discussed in more detail in the SDK Reference Manual. 


[Figure 2 placeholder: block diagram whose legible labels include Instruction Cache, Data Cache, Instruction Fetch, Mono Execution Unit (Branch, Reg File) and Poly Execution Unit (ALU, FPU, MAC, Reg File, SRAM, PIO).]


Figure 2. Processor block diagram

The main components of the processor are: 


• Execution unit: this consists of two main parts:

  - The mono execution unit, which acts on mono (nonparallel or scalar) data and
    handles program flow control and I/O functions.

  - The poly execution unit, which contains an array of Processing Elements (PEs)
    that act on poly (parallel) data.

• Control unit: fetches and decodes instructions. Instructions from a single instruction
  stream are executed by the appropriate part of the execution unit.

• Caches: instruction and data caches to speed accesses to external code and data.

• I/O: as well as loads and stores from the mono and poly execution units, there are also
  input/output instructions used to transfer data between poly and mono memory.


It is the poly execution unit and its array of PEs that provide the processor’s massive
processing power and memory bandwidth. The mono and poly execution units have
basically the same architecture and instruction set. This tight integration of the mono and
poly execution units means that the processor as a whole is efficient for simple sequential
code, as well as when processing large amounts of data in parallel.


Having a single instruction stream and programming model for the processor also simplifies
programming and makes it possible to provide an efficient high-level language compiler.


Instruction set


The processor has a fairly standard RISC-like instruction set. Most instructions can operate
on mono or poly operands and are executed by the appropriate part of the execution unit.


The instruction set provides a standard set of functions on both mono and poly execution
units:

• Integer and floating point adds, subtracts, multiplies, divides, and so on.

• Logical operations: and, or, not, xor.

• Arithmetic and logical shifts.

• Data comparisons: equal, not equal, greater than, less than, and so on.

• Data movement between registers.

• Loads and stores: direct, indirect, indexed.


In addition, there are some instructions which are only relevant to either the mono or poly
execution unit. For example, all program flow control is handled by the mono unit.


2.2 CSX architecture


Figure 3 on page 15 shows a more detailed view of the multithreaded array processor
architecture. Each of the major functional units is described in more detail in the following
sections.


2.2.1 Control unit


The control unit fetches instructions from memory, decodes them and dispatches them to 
the mono or poly execution units as appropriate. 


The controller provides hardware support for multithreaded code. This is a vital part of the 
architecture for achieving the performance potential of the processor. Because of the highly 
parallel architecture of the poly execution unit, there can be significant latencies if, for 
example, all PEs need to read or write external data. On the other hand, it is important to 
ensure that the array is always working to achieve maximum efficiency. The multithreaded 
architecture addresses both these issues by supporting very rapid task switching which 
allows I/O to be efficiently overlapped with processing. This serves to hide the latency of 
accesses and keep the PE array busy. 


Threads are prioritized: the processor will run the highest priority thread that is ready to run. 
A higher priority thread can preempt a lower priority thread when it becomes ready to run. 
Threads are synchronized — with each other and with hardware such as I/O engines — via 
semaphores. 


In the simplest case, a program would have two threads: one for I/O and one for compute.
By pre-fetching data in the I/O thread the code can ensure it has arrived in the PEs when it
is required and the array can run without stalling. It is quite possible for a compiler to
perform this sort of optimization on compiled code.


Document No. 06-RM-1137 Revision: 3.A
ClearSpeed Technology plc
CSX600 Instruction Set Architecture Overview


[Figure 3 placeholder: detailed block diagram whose legible labels include Instruction Cache (8 Kbytes), Data Cache (4 Kbytes), Control Unit (Instruction Fetch, Instruction Decode, Interrupt Generation, Semaphores), Mono Execution Unit (ALU, Load/Store, Register File, Per Thread Registers), PIO Controller, Load/Store Controller, Array Controller, and Poly Execution Unit with PEs (MAC, FP Mul, FP Add, ALU, Register File (128 bytes), SRAM (6 Kbytes)) and PIO.]


Figure 3. Processor architecture


Threads do not have to be explicitly created by the programmer, if the standard library 
functions for input and output are used. The standard library functions for input and output 
use threads to allow the data transfer to operate concurrently with processing. 


2.2.2 Execution units 


This section outlines the common aspects of the poly and mono execution units. The 
specifics of each execution unit are described later. 


Arithmetic-Logic Unit (ALU)


Arithmetic operations are provided in several versions including integer operations on 
signed and unsigned data of various sizes, and floating point operations. 


Register files 


The mono and poly register files can be addressed very flexibly as registers of various 
sizes. The contents of the register file can be treated as an array of bytes, where a 
contiguous set of bytes can be treated as a single register. The registers are specified in a 
consistent way using byte addresses and widths specified in bytes. 


As an example, the first byte of the poly register file (address 0) can be used as a one-byte
register (referred to as 0:p1), the first two bytes can be used as a two-byte register (0:p2)
or the first four bytes can be used as a four-byte register (0:p4).


There are some constraints on the way the register file can be used. The main constraint is 
that all registers must be naturally aligned, that is, the register address must be a multiple of 
the register size. There may be other constraints, which can vary depending on the specific 
architecture being targeted. For example, the size of mono registers must be two bytes or 
larger. 
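

The byte-addressed view of the register file and the natural alignment rule can be modeled as follows. This Python sketch is illustrative only: the RegisterFile class is hypothetical, the little-endian byte order is an assumption suggested by the LSB/MSB labels in Figure 4, and target-specific constraints (such as the minimum mono register size) are not checked.

```python
class RegisterFile:
    """Register file modeled as an array of bytes (hypothetical helper)."""

    def __init__(self, size_bytes: int) -> None:
        self._bytes = bytearray(size_bytes)

    def _check(self, address: int, width: int) -> None:
        if width <= 0:
            raise ValueError("width must be nonzero")
        if address % width != 0:
            raise ValueError(f"register {address} of width {width} is not naturally aligned")
        if address < 0 or address + width > len(self._bytes):
            raise IndexError("register lies outside the register file")

    def write(self, address: int, width: int, value: int) -> None:
        self._check(address, width)
        wrapped = value % (1 << (8 * width))          # keep only 'width' bytes
        # Assumption: least significant byte is stored at the lowest address.
        self._bytes[address:address + width] = wrapped.to_bytes(width, "little")

    def read(self, address: int, width: int) -> int:
        self._check(address, width)
        return int.from_bytes(self._bytes[address:address + width], "little")
```

Writing a four-byte value at address 0 and then reading address 0 with width 1 or address 2 with width 2 shows how the same bytes can be viewed as registers of different sizes, while a four-byte access at address 6 is rejected as misaligned.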


[Figure 4 placeholder: the register file drawn as an array of bytes (LSB to MSB), with example registers marked: 4-byte register 0, 1-byte register 4, 1-byte register 7, 2-byte register 10 and 2-byte register 12.]


Figure 4. Register file addressing


In addition to the general purpose registers described above, the mono execution unit 
contains a number of specialized control and result registers. 


Status register


The status register contains five status bits that provide information about the result of
the last ALU or FPU operation. When set, these bits indicate:

• Carry generated or FPU inexact

• Zero result

• Most significant bit set or FPU underflow

• Overflow generated

• Negative result


Addressing modes 


Load and store instructions transfer data between memory and registers. In the case of the 
mono execution unit, these transfer data to and from external memory. Poly loads and 
stores, on the other hand, transfer data between the PE register file and PE memory. Data is
transferred between the PEs and external memory using I/O instructions (see
Section 2.2.5: I/O mechanisms on page 21).


There are a number of addressing modes for loads and stores which can be used for both 
mono and poly data. These are: 


Direct: The address to be read/written is specified as an immediate value. 
Indirect: The address is specified in a register. 
Offset: The address is calculated by adding an offset to a direct or indirect base address. 


Indexed: On the PEs, an optional poly index register can be added to the address. 


The combinations of addressing modes and domains are summarized in Table 5. 


operation | destination | address + offset + index 
mono mono 
immediate 
mono 
immediate mono 
immediate 
ld/st 
mono poly 
immediate 
poly 
poly 
immediate 
poly 
poly 
immediate 


Table 5. Load/store addressing modes


Loads and stores must be naturally aligned; that is, the address must be a multiple of the
size of the data.


Conditional code 


The main difference between code for the mono and the poly execution units is in the 
handling of conditional execution. The same set of arithmetic and comparison instructions 
are available on both processing units, but the way the results are handled is different. 


The mono unit uses conditional jumps to branch around code, based on the result of a
comparison or of previous instructions. This means that mono conditions affect both mono
and poly operations.


The poly unit uses a set of enable bits (described in more detail below) to control whether
each PE has its state changed by executing instructions. Therefore poly conditional code
only affects poly processing.


The mono execution unit can test the enable state of the PEs. This can be used, for 
example, to branch over code where all of the PEs are disabled and so avoid fetching code 
which will not be used by any of the PEs. 


2.2.3 Mono execution unit


The mono execution unit is a 16-bit, RISC-like processing unit consisting of:

• An arithmetic-logic unit (ALU)

• Multiply-accumulate (MAC) unit

• 64-bit floating point unit (FPU)

• A general purpose register file

• Status, control and result registers


As well as handling mono data, the mono unit is responsible for program flow control 
(branching, and so on), thread switching and other control functions. The mono execution 
unit also has overall control of I/O operations of poly data from the poly execution unit. 
Results from these operations are returned to a result register in the mono unit. 


Registers 


The mono execution unit contains (or provides access to) several sets of registers. These 
include the normal data registers and a number of control and status registers. To ensure 
that they are thread safe, some of these registers are multithreaded, that is, the values do 
not have to be saved and restored across threads. 


In addition to the registers described here, the mono execution unit also has access to the 
control registers for the I/O controllers and engines. 


Status registers 


The mono status register holds the status bits described above and also contains a number 
of user predicate bits. These can be useful to evaluate conditional expressions without 
using up general purpose registers. 


The predicates can be set and cleared by the programmer. In addition, a set of logical 
operations between predicate bits are defined. Because the status bits are simply predicate 
bits with predefined meanings, they can also be used in these operations. The relationship 
between status bits and predicates is shown in Table 6 on page 19. 


Control registers 


Each thread maintains a program counter (PC) register for fetching instructions in that 
thread. Each thread also has a register which holds the return address of the most recent 
function/procedure call. 


Result registers 


There is a set of result registers available to the mono execution unit. Some of these have
predefined functions; the others are available for the programmer to get the results of I/O
operations, for example.


Predicate number   Meaning

0                  Carry or FPU Inexact
1                  Zero
2                  MSB or FPU Underflow
3                  Overflow
4                  Negative
5                  True: always set
6-15               User defined


Table 6. Assignment of status bits to predicates


The result registers are mostly multithreaded. The exception is the cycle count register; this 
is incremented every cycle and provides support for profiling. 


Conditional execution 


The mono execution unit handles conditional execution in the same way as a traditional 
processor. A set of conditional jump instructions use various values to determine whether to 
take a branch or not. The value tested can be: 

• A value in a register: branch on zero or nonzero

• The result of a comparison: for example, branch if a register is greater than some value

• The value of the status bits: for example, branch if a previous operation set the carry
  flag

• The value of a predicate: branch if a specific predicate bit is zero or nonzero


Semaphores 


Threads are synchronized with one another and with hardware resources, such as I/O 
channels, by means of semaphores. 


Semaphores are special registers that can be incremented or decremented with atomic 
(noninterruptible) operations called signal and wait. A signal operation increments a 
semaphore. A wait operation decrements a semaphore unless the semaphore value is 0, in 
which case the wait will stall until the semaphore is signalled. 


Semaphores can be controlled by software and by hardware units (such as I/O controllers) 
in the processor. 
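

The signal and wait semantics described above can be captured in a very small model. This Python sketch is single-threaded and purely illustrative: the Semaphore class and the try_wait name are hypothetical, the atomicity of the real operations is not modeled, and a wait that would stall is reported by a return value instead of actually blocking a thread.

```python
class Semaphore:
    """Counting semaphore with the signal/wait behavior described in the text."""

    def __init__(self, value: int = 0) -> None:
        if value < 0:
            raise ValueError("semaphore value cannot be negative")
        self.value = value

    def signal(self) -> None:
        """Increment the semaphore."""
        self.value += 1

    def try_wait(self) -> bool:
        """Decrement and return True, or return False if the wait would stall."""
        if self.value == 0:
            return False        # real hardware would stall until signalled
        self.value -= 1
        return True
```

In the two-thread arrangement mentioned earlier, an I/O thread would signal a semaphore when data has arrived and the compute thread would wait on it before using that data.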


2.2.4 Poly execution unit


The poly unit implements an advanced Single Instruction Multiple Data (SIMD) architecture
which, combined with the mono execution unit and the multithreaded controller, provides the
high performance, low power, flexibility and ease of programming associated with the
architecture. The poly execution unit is basically an array of tens, hundreds or thousands of
processing elements (PEs).


Each PE (Figure 5) consists of:

• An arithmetic-logic unit (ALU)

• Multiply-accumulate (MAC) unit

• 64-bit floating point unit (FPU)

• A general purpose register file

• Status and enable registers

• A block of memory

• An inter-PE communication path

• I/O channels


Load and store instructions move data between a PE’s register file and memory, while the
ALU operates on data in the register file. Data is transferred into and out of the array using
the I/O instructions.


[Figure 5 placeholder: a processing element, with legible labels including DIV/SQRT, Register File, Programmed I/O and PE Memory.]


Figure 5. Processing element architecture


ALU 


The ALU is capable of performing a set of arithmetic operations on values held in the PE 
register file. 


The performance of some of these functions will depend on whether they are performed 
purely in software or whether one of the ALU extensions is included in the target processor. 
Examples of ALU extensions are the Floating Point Unit (FPU) and the Multiply-Accumulate 
(MAC) unit. 


Status register 


The ALU, and FPU or MAC if present, will set status bits indicating the result of the 
operations. 
Each of these functions has a separate status register. The bits in these are shown in
Table 7.


Bit number   ALU & MAC status   FPU status

0            MSB                Underflow
1            Carry              Inexact
2            Overflow           Overflow
3            Negative           Negative
4            Zero               Zero


Table 7. Assignment of poly status bits


Conditional behavior 


The SIMD nature of the PE array prohibits each PE having its own branch unit (branching 
being handled by the mono execution unit). Instead, each PE can control whether its state 
should be updated by any following instructions by enabling or disabling itself; this is similar 
to the predicated operations in some RISC CPUs. 


Enable state 


A PE’s enable state is determined by the bits in the enable register. If all these bits are set, 
then a PE is enabled, and executes instructions normally. If one or more of the enable bits is 
zero (clear), then the PE is disabled and most instructions it receives will be ignored 
(instructions on the enable state, for example, are not disabled). 


The enable register is treated as a stack, and new bits can be pushed onto the top of the 
stack. This is usually the result of a conditional instruction. The result of a test, either a 1 or 
a 0, is pushed onto the enable stack. This bit can later be popped from the top of the stack 
to remove the effect of that test result. This makes handling nested conditions and loops 
very efficient. 
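
The enable stack described above can be modelled in a few lines. This is a minimal sketch in Python, assuming an unbounded list stands in for the fixed-width enable register; the class name EnableStack and the helper nested_demo are hypothetical and only illustrate the push/pop behavior.

```python
class EnableStack:
    """Toy model of one PE's enable register, treated as a stack of bits."""

    def __init__(self):
        self.bits = []  # assumption: an empty stack means nothing disables the PE

    def push(self, test_result):
        """Push the result of a test (1 or 0) onto the top of the stack."""
        self.bits.append(1 if test_result else 0)

    def pop(self):
        """Remove the most recent test result."""
        return self.bits.pop()

    @property
    def enabled(self):
        # The PE is enabled only if every enable bit is set.
        return all(self.bits)


def nested_demo(x):
    """Mimic 'if (x > 0) { if (x > 10) { ... } }' and record the enable state."""
    pe = EnableStack()
    trace = []
    pe.push(x > 0)
    trace.append(pe.enabled)   # inside outer condition
    pe.push(x > 10)
    trace.append(pe.enabled)   # inside inner condition
    pe.pop()
    trace.append(pe.enabled)   # back in outer condition
    pe.pop()
    trace.append(pe.enabled)   # outside both conditions
    return trace


print(nested_demo(5))
```

Because a single clear bit anywhere in the stack disables the PE, an inner test cannot re-enable a PE that an outer test has already disabled, which is what makes nesting work.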


Instructions 


Conditional execution on the poly unit is supported by a set of poly conditional instructions: 
if, else, endif, and so on. These manage the enable bits to allow different PEs to 
execute the appropriate branch of an if...else construct in C, for example. These also 
support nested conditions. 


Forced loads and stores 


The poly loads and stores are normally controlled by the enable state of the PE. However, 
because there are instances where it is necessary to load and store data regardless of the 
current enable state, the instruction set includes forced loads and stores. These will affect 
the state of a PE even if it is disabled. 
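
The difference between a normal and a forced poly load can be sketched as follows. This is an illustrative Python model only; the function poly_load and its forced flag are hypothetical names, not instruction mnemonics.

```python
def poly_load(registers, enabled, values, forced=False):
    """Return the new register contents, one entry per PE.

    A normal load updates only enabled PEs. A forced load updates every PE,
    regardless of its enable state.
    """
    return [
        new if (forced or is_enabled) else old
        for old, is_enabled, new in zip(registers, enabled, values)
    ]


regs = [0, 0, 0]
enabled = [True, False, True]
print(poly_load(regs, enabled, [7, 8, 9]))
print(poly_load(regs, enabled, [7, 8, 9], forced=True))
```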


2.2.5 I/O mechanisms
Programmed I/O (PIO) allows the PEs to perform loads and stores to mono memory; it 
transfers data between PE memory and external memory. Each PE can choose to take part 
in an I/O operation or not, and each PE can provide different addresses in poly and external 
memory for transfers. 
The PIO mechanism provides two addressing modes:


Direct addressed mode where each PE calculates the destination or source address 
explicitly. 


Strided mode where each PE uses an address offset from a base address. The 
address for a given PE is incremented each time an enabled PE is encountered. This 
results in the ability for the PEs to write to or read from a contiguous area of mono 
memory even if some of them are disabled. 
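
The strided mode can be illustrated with a short model. The sketch below is Python and assumes a fixed stride per PE and a simple left-to-right walk over the PEs; the function name strided_addresses and its parameters are hypothetical.

```python
def strided_addresses(base, stride, enabled):
    """Return the mono memory address each PE uses, or None if it is disabled.

    The running address advances only when an enabled PE is encountered,
    so the enabled PEs map onto a contiguous region of memory.
    """
    addresses = []
    next_address = base
    for is_enabled in enabled:
        if is_enabled:
            addresses.append(next_address)
            next_address += stride
        else:
            addresses.append(None)
    return addresses


print(strided_addresses(0x100, 4, [True, False, True, True]))
```

In direct addressed mode, by contrast, each PE would supply its own address and no such packing takes place.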


Access to external memory can be consolidated, for example, multiple reads from the same 
address are combined into a single memory access and the data distributed to all the PEs 
that require it. This can make a significant difference to the effective memory bandwidth. 
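
The effect of consolidation can be sketched as grouping requests by address. The following Python model is an illustration under the assumption that memory is a simple address-to-value mapping; consolidate_reads and serve_reads are hypothetical names.

```python
from collections import defaultdict


def consolidate_reads(requests):
    """Group (pe_index, address) read requests by address."""
    by_address = defaultdict(list)
    for pe, address in requests:
        by_address[address].append(pe)
    return dict(by_address)


def serve_reads(requests, memory):
    """Perform one memory access per distinct address and distribute the data.

    Returns ({pe_index: data}, number_of_memory_accesses).
    """
    results = {}
    accesses = 0
    for address, pes in consolidate_reads(requests).items():
        data = memory[address]  # single access for this address
        accesses += 1
        for pe in pes:
            results[pe] = data
    return results, accesses


memory = {0: 10, 4: 20}
print(serve_reads([(0, 0), (1, 0), (2, 4), (3, 0)], memory))
```

Four requests are served with two accesses in this example, which is the kind of saving the paragraph above refers to.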


In order to make maximum use of the processor it is normal to perform I/O concurrently with 
computation. This is one of the main uses of the multithreaded architecture. By using a 
separate thread for I/O operations, they can run asynchronously with the processing tasks: 
running when there is data to be fetched and synchronizing with the compute threads via 
semaphores. A set of C libraries is provided to perform asynchronous I/O.


Swazzle 


Finally, the PEs are able to communicate with one another via what is known as the swazzle 
path that connects the register file of each PE with the register files of its left and right 
neighbors. This allows PE(n) to perform a register-to-register transfer to either its left or right
neighbor, PE(n-1) or PE(n+1), while simultaneously receiving data from the other neighbor.


Swazzle instructions use multiples of 2 bytes in their arguments, so the source and 
destination registers must be 2-byte aligned. Instructions are provided to shift data left or 
right through the array, and to swap data between adjacent PEs. 


The enable state of a PE affects its participation in a swazzle operation in the following way: 
if a PE is enabled, then its registers may be updated by a neighboring PE, regardless of the 
neighboring PE’s enable state. Conversely, if a PE is disabled, its register file may not be 
altered by a neighbor under any circumstance. A disabled PE will still provide data to an 
enabled neighbor. 


The data written into the registers of the PEs at the ends of the swazzle path can be set in 
registers in the mono execution unit. Similarly, the value shifted out of the end of the 
swazzle path can be read from a result register in the mono execution unit. 
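
The interaction between a swazzle shift and the enable state can be sketched as below. This Python model is a simplification: it shifts one value per PE one position to the right, takes the value entering the left end as a parameter (standing in for the mono register mentioned above), and the name swazzle_shift_right is hypothetical.

```python
def swazzle_shift_right(regs, enabled, left_edge_value):
    """Shift one register value per PE one position to the right.

    A disabled PE keeps its own value but still supplies data to its
    right-hand neighbor. Returns (new_regs, value_shifted_out_of_right_end).
    """
    incoming = [left_edge_value] + regs[:-1]
    new_regs = [
        inc if is_enabled else old
        for old, inc, is_enabled in zip(regs, incoming, enabled)
    ]
    return new_regs, regs[-1]


print(swazzle_shift_right([1, 2, 3, 4], [True, False, True, True], 0))
```

Note that PE 2 still receives the value 2 from the disabled PE 1, while PE 1 itself is left unchanged.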
Instruction set description


The instruction set is split into 3 main sections. The high level section describes the 
instructions which are stable and require a less detailed understanding of the architecture. 
They are a mixture of macro, hardware and microcoded instructions. The low level section 
consists of hardware and microcoded instructions and is intended primarily to help 
understand the code generated from the high level section, but is less safe to use directly 
and may change. These require a more detailed understanding of the architecture. The 
specialist section contains instructions which accelerate specific applications. 


3.1 High level instructions 
These instructions are stable and require a less detailed understanding of the architecture. 
They are a mixture of macro, hardware and microcoded instructions. 

3.1.1 Move and cast instructions 
These instructions simply move or convert from one type to another. Macro move and cast 
instructions can be used by including move.inc and cast.inc respectively. 

Document No. 06-RM-1137 Revision: 3.A 23 


ClearSpeed Technology plc 


Instruction set description CSX600 Instruction Set 


mov 


Operands: dst, src 


Description: 


dst = src (bytewise copy)

Constraints:

• dst has maximum width of 32.


Notes: The 4 byte poly move instruction requires that the source and destination be
similarly aligned in an oct; otherwise the instruction is a macro and breaks down into two 2
byte poly moves. In the table below, the first entry for 4 byte poly moves applies when they are
similarly aligned and the second when they are not.


Details: 

dst    src    cycles  latency  temporaries  comments
m2     m2     1       1                     hardware
m2     il2    2       2                     hardware
m4     m4     2       2                     macro
m4     il4    4       4                     macro
m8     m8     4       4                     macro
m8     il8    8       8                     macro
m16    m16    8       8                     macro
m16    il16   16      16                    macro
m32    m32    16      16                    macro
m32    il32   32      32                    macro
p1     p1     1       2                     microcode
p1     m2     1       2                     microcode
p1     il1    1       2                     microcode
p2     p2     2       3                     microcode
p2     m2     2       3                     microcode
p2     il2    2       3                     microcode
p4     p4     2       2                     microcode
p4     m4     4       5                     macro
p4     il4    4       5                     macro
p4     p4     4       5                     macro
p8     p8     2       2                     microcode
p8     m8     8       9                     macro
p8     il8    8       9                     macro
p16    p16    4       4                     macro
p16    m16    16      17                    macro
p16    il16   16      17                    macro
p32    p32    8       8                     macro
p32    m32    32      33                    macro
p32    il32   32      33                    macro
Operands: dst, src


Description:


dst = src (bytewise copy), regardless of enable state.


Constraints:

• All operands have equal width to
• dst has maximum width of 32.


Details: 

dst    src    cycles  latency  temporaries  comments
p1     p1     1       2                     microcode
p1     il1    1       2                     microcode
p2     p2     2       3                     microcode
p2     m2     2       3                     microcode
p2     il2    2       3                     microcode
p4     p4     2       2                     microcode
p4     m4     4       5                     macro
p4     il4    4       5                     macro
p8     p8     2       2                     microcode
p8     m8     8       9                     macro
p8     il8    8       9                     macro
p16    m16    16      17                    macro
p16    il16   16      17                    macro
p32    m32    32      33                    macro
p32    il32   32      33                    macro
dup


Operands: dst, src


Description:


Duplicates the byte in src to every byte in dst.

Constraints:

• All operands have domain poly.
• src has width of 1.


Details:

dst    src    cycles  latency  temporaries  comments
p2     p1     1       2                     microcode
p4     p1     1       2                     microcode
fdup


Operands: dst, src


Description:


Duplicates the byte in src to every byte in dst, regardless of enable state.

Constraints:

• All operands have domain poly.
• src has width of 1.


Details:

dst    src    cycles  latency  temporaries  comments
p2     p1     1       2                     microcode
p4     p1     1       2                     microcode
cast (integer 4 byte mono to integer 2 byte mono)


Operands: dst, src


Description:


Converts src to dst, taking the least significant 2 bytes of src.

Constraints:

• All operands have type integer.
• All operands have domain mono.
• dst has width of 2.
• src has width of 4.

Side Effects:

• Leaves the status register in an undefined state.

Details:

dst    src    cycles  latency  temporaries  comments
m2us   m4us   1       1                     hardware
cast (integer 4 byte poly to integer 2 byte poly)


Operands: dst, src


Description:


Converts src to dst, taking the least significant 2 bytes of src.

Constraints:

• All operands have type integer.
• All operands have domain poly.
• dst has width of 2.
• src has width of 4.

Side Effects:

• Leaves the status register in an undefined state.

Details:

dst    src    cycles  latency  temporaries  comments
p2us   p4us   2       3                     microcode
cast (integer 4 byte poly to integer 1 byte poly)


Operands: dst, src


Description:


Converts src to dst, taking the least significant byte of src.

Constraints:

• All operands have type integer.
• All operands have domain poly.
• dst has width of 1.
• src has width of 4.

Side Effects:

• Leaves the status register in an undefined state.

Details:

dst    src    cycles  latency  temporaries  comments
p1us   p4us   1       2                     microcode
cast (integer 2 byte mono to integer 4 byte mono)


Operands: dst, src


Description:


Converts src to dst, extending it to 4 bytes.

Constraints:

• All operands have type integer.
• All operands have domain mono.
• dst has width of 4.
• src has width of 2.

Side Effects:

• Leaves the status register in an undefined state.

Notes: If src is signed, then it is sign extended into dst; otherwise it is zero extended.

Details:

dst    src    cycles  latency  temporaries  comments
m4us   m2us                                 macro
cast (integer 2 byte poly to integer 4 byte poly)


Operands: dst, src


Description:


Converts src to dst, extending it to 4 bytes.

Constraints:

• All operands have type integer.
• All operands have domain poly.
• dst has width of 4.
• src has width of 2.

Side Effects:

• Leaves the status register in an undefined state.

Notes: If src is signed, then it is sign extended into dst; otherwise it is zero extended.

Details:

dst    src    cycles  latency  temporaries  comments
p4us   p2us   4       5                     microcode
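
The difference between sign extension and zero extension in the widening casts can be sketched as follows. This is illustrative Python operating on raw bit patterns; the function name extend and its parameters are hypothetical.

```python
def extend(value, src_bytes, dst_bytes, signed):
    """Widen a src_bytes-wide bit pattern to dst_bytes.

    Signed sources are sign extended; unsigned sources are zero extended.
    """
    src_bits = 8 * src_bytes
    value &= (1 << src_bits) - 1
    if signed and value >> (src_bits - 1):
        value -= 1 << src_bits          # reinterpret as a negative number
    return value & ((1 << (8 * dst_bytes)) - 1)


print(hex(extend(0x8001, 2, 4, signed=True)))
print(hex(extend(0x8001, 2, 4, signed=False)))
```

The two results differ only in the upper two bytes, which are filled with copies of the sign bit in the signed case and with zeros otherwise.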
cast (integer 2 byte poly to integer 1 byte poly)


Operands: dst, src


Description:


Converts src to dst, taking the least significant byte of src.

Constraints:

• All operands have type integer.
• All operands have domain poly.
• dst has width of 1.
• src has width of 2.

Side Effects:

• Leaves the status register in an undefined state.

Details:

dst src cycles latency temporaries | comments
p1us p2us 1 macro




cast (integer 1 byte mono to integer 2 byte mono)


Operands: dst, src


Description:


Converts src to dst, extending src.

Constraints:

• All operands have type integer.
• All operands have domain mono.
• dst has width of 2.
• src has width of 1.

Side Effects:

• Leaves the status register in an undefined state.

Notes: If src is signed, then it is sign extended into dst; otherwise it is zero extended. This
is useful after a one byte mono load.

Details:

dst    src    cycles  latency  temporaries  comments
m2u    m1u    1       1                     hardware
m2s    m1s    6       6                     macro
cast (integer 1 byte poly to integer 2 byte poly)


Operands: dst, src


Description:


Converts src to dst, extending src.

Constraints:

• All operands have type integer.
• All operands have domain poly.
• dst has width of 2.
• src has width of 1.

Side Effects:

• Leaves the status register in an undefined state.

Notes: If src is signed, then it is sign extended into dst; otherwise it is zero extended.

Details:

dst src cycles latency temporaries | comments
p2u p1u 2 macro
p2s p1s 3 microcode
cast (integer 1 byte mono to integer 4 byte mono)


Operands: dst, src


Description:


Converts src to dst, extending src.

Constraints:

• All operands have type integer.
• All operands have domain mono.
• dst has width of 4.
• src has width of 1.

Side Effects:

• Leaves the status register in an undefined state.

Notes: If src is signed, then it is sign extended into dst; otherwise it is zero extended.

Details:

dst    src    cycles  latency  temporaries  comments
m4u    m1u    2       2                     macro
m4s    m1s    8       8                     macro
cast (integer 1 byte poly to integer 4 byte poly)


Operands: dst, src


Description:


Converts src to dst, extending src.

Constraints:

• All operands have type integer.
• All operands have domain poly.
• dst has width of 4.
• src has width of 1.

Side Effects:

• Leaves the status register in an undefined state.

Notes: If src is signed, then it is sign extended into dst; otherwise it is zero extended.

Details:

dst    src    cycles  latency  temporaries  comments
p4u    p1u    4       5                     macro
p4s    p1s    7       8                     macro
cast (float to 4 byte integer)


Operands: dst, src


Description:


Converts float src to integer dst.

Constraints:

• dst has type integer.
• dst has width of 4.
• src has type float.
• src has width of 4.
• src has not a domain of label.

Side Effects:

• Requires up to 2 levels of enable stack.

Details:

dst    src    cycles  latency  temporaries  comments
m4us   m4f                                  hardware
p4us   p4f                                  microcode
cast (4 byte integer to float)


Operands: dst, src


Description:


Converts integer src to float dst.

Constraints:

• dst has width of 4.
• dst has type float.
• src has type integer.
• src has width of 4.
• src has not a domain of label.

Side Effects:

• Leaves the status register in an undefined state.

Details:

dst    src    cycles  latency  temporaries  comments
m4f    m4us   4       4                     hardware
p4f    p4us   4       5                     microcode
cast (4 byte integer to 64 bit float)


Operands: dst, src


Description:


Converts integer src to 64 bit float dst.

Constraints:

• dst has type float.
• dst has width of 8.
• src has type integer.
• src has width of 4.
• src has not a domain of label.

Side Effects:

• Leaves the status register in an undefined state.
• Requires up to 2 levels of enable stack.

Details:

dst    src    cycles  latency  temporaries  comments
m8f    m4us   4       4                     hardware
p8f    p4us   4       5                     microcode
cast (64 bit float to 4 byte integer)


Operands: dst, src


Description:


Converts 64 bit float src to integer dst.

Constraints:

• dst has type integer.
• dst has width of 4.
• src has type float.
• src has width of 8.
• src has not a domain of label.

Side Effects:

• Leaves the status register in an undefined state.

Details:

dst    src    cycles  latency  temporaries  comments
m4us   m8f    4       4                     hardware
p4us   p8f    4       5                     microcode
cast (64 bit float to 32 bit float)


Operands: dst, src


Description:


Converts 64 bit float src to 32 bit float dst.

Constraints:

• dst has type float.
• dst has width of 4.
• src has type float.
• src has width of 8.
• src has not a domain of label.

Side Effects:

• Leaves the status register in an undefined state.

Details:

dst    src    cycles  latency  temporaries  comments
m4f    m8f    4       4                     hardware
p4f    p8f    4       5                     microcode
cast (32 bit float to 64 bit float)


Operands: dst, src


Description:


Converts 32 bit float src to 64 bit float dst.

Constraints:

• dst has type float.
• dst has width of 8.
• src has type float.
• src has width of 4.
• src has not a domain of label.

Side Effects:

• Leaves the status register in an undefined state.

Details:

dst src cycles latency temporaries | comments
m8f m4f 4 hardware
p8f p4f 5 microcode
poly.to.mono.and


Operands: dst, src


Description:


Combines all enabled poly src bytes using bitwise AND, putting the result
in mono register dst. If all processing elements are disabled, the
result will be all 1s (0x00ff).

Constraints:

• dst has domain mono.
• dst has width of 2.
• src has domain poly.
• src has width of 1.

Notes: There will be an additional delay on the real hardware of 10-20 cycles, as the
information filters through the hardware pipeline, before the result is available to the user.

Details:

dst    src    cycles  latency  temporaries  comments
m2     p1     13      13                    macro
poly.to.mono.and


Operands: dst, src


Description:


Combines all enabled poly src bytes using bitwise AND, putting the result
in mono register dst. If all processing elements are disabled, the
result will be all 1s (binary).

Constraints:

• All operands have equal width to
• dst has domain mono.
• src has domain poly.

Notes: There will be an additional delay on the real hardware of 10-20 cycles, as the
information filters through the hardware pipeline, before the result is available to the user.

Details:

dst    src    cycles  latency  temporaries  comments
m2     p2     21      21                    macro
m4     p4     42      42                    macro
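
The reduction performed by poly.to.mono.and can be modelled as an AND over the enabled PEs, starting from an all-ones value. The Python sketch below is illustrative only; the function name and the width_bytes parameter are assumptions made for the example.

```python
from functools import reduce


def poly_to_mono_and(src, enabled, width_bytes=1):
    """AND together the src values of all enabled PEs.

    Starting from all 1s means the result is all 1s when no PE is enabled.
    """
    all_ones = (1 << (8 * width_bytes)) - 1
    active = (value for value, is_enabled in zip(src, enabled) if is_enabled)
    return reduce(lambda acc, value: acc & value, active, all_ones)


print(hex(poly_to_mono_and([0xF0, 0x3C, 0xFF], [True, True, False])))
print(hex(poly_to_mono_and([0xF0, 0x3C, 0xFF], [False, False, False])))
```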
3.1.2 Arithmetic instructions


These instructions perform add, subtract, neg, multiply and divide operations. Some of 
these instructions allow the carry flag to be added in. Macro arithmetic instructions can be 
used by including arith.inc. 
add (integer)


Operands: dst, src0, src1


Description:


dst = src0 + src1.

Constraints:

• All operands have equal width to
• All operands have type integer.

Side Effects:

• Updates the status register.

Notes: This can be chained together with addc to make a larger add. For example,

add 0:p2, 2:p2, 4:p2

is equivalent to

add 0:p1, 2:p1, 4:p1

followed by

addc 1:p1, 3:p1, 5:p1

(little endian).

Details:

dst    src0   src1   cycles  latency  temporaries  comments
m2us   m2us   m2us   1       1                     hardware
m2us   m2us   il2us  2       2                     hardware
m4us   m4us   m4us   2       2                     macro
m4us   m4us   il4us  4       4                     macro
p1us   p1us   p1us   1       2                     microcode
p1us   p1us   m2us   1       2                     microcode
p1us   p1us   il1us  1       2                     microcode
p2us   p2us   p2us   2       3                     microcode
p2us   p2us   m2us   2       3                     microcode
p2us   p2us   il2us  2       3                     microcode
p4us   p4us   p4us   4       5                     microcode
p4us   p4us   m4us   4       5                     macro
p4us   p4us   il4us  4       5                     macro
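
The chaining described in the notes can be checked with a small model. The sketch below is illustrative Python; add8 and add16_chained are hypothetical helper names, and the carry is modelled as the ninth bit of the byte-wide sum.

```python
def add8(a, b, carry_in=0):
    """8-bit add: returns (result byte, carry out)."""
    total = a + b + carry_in
    return total & 0xFF, total >> 8


def add16_chained(x, y):
    """16-bit add built from an 8-bit add and an 8-bit add-with-carry."""
    # Low bytes first (little endian), like: add 0:p1, 2:p1, 4:p1
    lo, carry = add8(x & 0xFF, y & 0xFF)
    # High bytes plus the carry, like: addc 1:p1, 3:p1, 5:p1
    hi, _ = add8((x >> 8) & 0xFF, (y >> 8) & 0xFF, carry)
    return (hi << 8) | lo


print(hex(add16_chained(0x12FF, 0x0001)))
```

The same pattern extends to wider values by appending further add-with-carry steps, one per additional byte.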
add (floating point 32 bit)


Operands: dst, src0, src1


Description:


dst = src0 + src1.

Constraints:

• All operands have type float.
• All operands have width of 4.
• src1 has not a domain of label.

Notes: The mono version of this instruction will set the predicates. The negative and zero
flags will map onto the normal status flags. See the reference manual for more information on
how the other predicates are set. The poly version will set a separate floating point add
status register. This register can be accessed by the instruction status.fpadd.get and also
by the instructions prefixed with if.fpadd and andif.fpadd. There is no way to restore the
poly floating point status, so it is advisable to only do poly floating point operations in a
single thread.

Details:

dst    src0   src1   cycles  latency  temporaries  comments
m4f    m4f    m4f    4       4                     hardware
m4f    m4f    i4f    8       8        4 mono       macro
p4f    p4f    p4f    4       5                     microcode
p4f    p4f    i4f    8       9        4 poly       macro
add (floating point 64 bit)


Operands: dst, src0, src1


Description:


dst = src0 + src1.

Constraints:

• All operands have type float.
• All operands have width of 8.
• src1 has not a domain of label.

Notes: The mono version of this instruction will set the predicates. The negative and zero
flags will map onto the normal status flags. See the reference manual for more information on
how the other predicates are set. The poly version will set a separate floating point add
status register. This register can be accessed by the instruction status.fpadd.get and also
by the instructions prefixed with if.fpadd and andif.fpadd. There is no way to restore the
poly floating point status, so it is advisable to only do poly floating point operations in a
single thread.

Details:

dst    src0   src1   cycles  latency  temporaries  comments
m8f    m8f    m8f    4       4                     hardware
m8f    m8f    i8f    10      10       8 mono       macro
p8f    p8f    p8f    4       5                     microcode
p8f    p8f    i8f    12      13       8 poly       macro
addc (integer)


Operands: dst, src0, src1


Description:


dst = src0 + src1 + carry.

Constraints:

• All operands have equal width to
• All operands have type integer.

Side Effects:

• Updates the status register.

Notes: This can be chained together with add and addc to make a larger add. For example,

add 0:p2, 2:p2, 4:p2

is equivalent to

add 0:p1, 2:p1, 4:p1

followed by

addc 1:p1, 3:p1, 5:p1

(little endian).

Details:

dst    src0   src1   cycles  latency  temporaries  comments
m2us   m2us   m2us   1       1                     hardware
m2us   m2us   il2us  2       2                     hardware
m4us   m4us   m4us   2       2                     macro
m4us   m4us   il4us  4       4                     macro
p1us   p1us   p1us   1       2                     microcode
p1us   p1us   m2us   1       2                     microcode
p1us   p1us   il1us  1       2                     microcode
p2us   p2us   p2us   2       3                     macro
p2us   p2us   m2us   2       3                     microcode
p2us   p2us   il2us  2       3                     microcode
p4us   p4us   p4us   4       5                     macro
p4us   p4us   m4us   4       5                     macro
p4us   p4us   il4us  4       5                     macro
sub (integer)


Operands: dst, src0, src1


Description:


dst = src0 - src1.

Constraints:

• All operands have type integer.
• All operands have equal width to

Side Effects:

• Updates the status register.

Notes: This can be chained together with subc to make a larger sub. For example,

sub 0:p2, 2:p2, 4:p2

is equivalent to

sub 0:p1, 2:p1, 4:p1

followed by

subc 1:p1, 3:p1, 5:p1

(little endian).

Details:

dst    src0   src1   cycles  latency  temporaries  comments
m2us   m2us   m2us   1       1                     hardware
m2us   m2us   il2us  2       2                     hardware
m4us   m4us   m4us   2       2                     macro
m4us   m4us   il4us  4       4                     macro
p1us   p1us   p1us   1       2                     microcode
p1us   p1us   m2us   1       2                     microcode
p1us   p1us   il1us  1       2                     microcode
p2us   p2us   p2us   2       3                     microcode
p2us   p2us   m2us   2       3                     microcode
p2us   p2us   il2us  2       3                     microcode
p4us   p4us   p4us   4       5                     microcode
p4us   p4us   m4us   4       5                     macro
p4us   p4us   il4us  4       5                     macro
sub (floating point 32 bit)


Operands: dst, src0, src1


Description:


dst = src0 - src1.

Constraints:

• All operands have type float.
• All operands have width of 4.
• src1 has not a domain of label.

Notes: The mono version of this instruction will set the predicates. The negative and zero
flags will map onto the normal status flags. See the reference manual for more information on
how the other predicates are set. The poly version will set a separate floating point add
status register. This register can be accessed by the instruction status.fpadd.get and also
by the instructions prefixed with if.fpadd and andif.fpadd. There is no way to restore the
poly floating point status, so it is advisable to only do poly floating point operations in a
single thread.

Details:

dst    src0   src1   cycles  latency  temporaries  comments
m4f    m4f    m4f    4       4                     hardware
m4f    m4f    i4f    8       8        4 mono       macro
p4f    p4f    p4f    4       5                     microcode
p4f    p4f    i4f    8       9        4 poly       macro
sub (floating point 64 bit)


Operands: dst, src0, src1


Description:


dst = src0 - src1.

Constraints:

• All operands have type float.
• All operands have width of 8.
• src1 has not a domain of label.

Notes: The mono version of this instruction will set the predicates. The negative and zero
flags will map onto the normal status flags. See the reference manual for more information on
how the other predicates are set. The poly version will set a separate floating point add
status register. This register can be accessed by the instruction status.fpadd.get and also
by the instructions prefixed with if.fpadd and andif.fpadd. There is no way to restore the
poly floating point status, so it is advisable to only do poly floating point operations in a
single thread.

Details:

dst    src0   src1   cycles  latency  temporaries  comments
m8f    m8f    m8f    4       4                     hardware
m8f    m8f    i8f    10      10       8 mono       macro
p8f    p8f    p8f    4       5                     microcode
p8f    p8f    i8f    12      13       8 poly       macro
subc (integer)


Operands: dst, src0, src1


Description:


dst = src0 - src1 - 1 + carry.

Constraints:

• All operands have type integer.
• All operands have equal width to

Side Effects:

• Updates the status register.

Notes: This can be chained together with sub and subc to make a larger sub. For example,

sub 0:p2, 2:p2, 4:p2

is equivalent to

sub 0:p1, 2:p1, 4:p1

followed by

subc 1:p1, 3:p1, 5:p1

(little endian).

Details:

dst    src0   src1   cycles  latency  temporaries  comments
m2us   m2us   m2us   1       1                     hardware
m2us   m2us   il2us  2       2                     hardware
m4us   m4us   m4us   2       2                     macro
m4us   m4us   il4us  4       4                     macro
p1us   p1us   p1us   1       2                     microcode
p1us   p1us   m2us   1       2                     microcode
p1us   p1us   il1us  1       2                     microcode
p2us   p2us   p2us   2       3                     macro
p2us   p2us   m2us   2       3                     microcode
p2us   p2us   il2us  2       3                     microcode
p4us   p4us   p4us   4       5                     macro
p4us   p4us   m4us   4       5                     macro
p4us   p4us   il4us  4       5                     macro
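
The formula dst = src0 - src1 - 1 + carry is easiest to read with a worked model. The Python sketch below assumes that a subtract leaves the carry set when no borrow occurred, which is what makes the formula chain correctly; that convention is an assumption here, and sub8 and sub16_chained are hypothetical helper names.

```python
def sub8(a, b, carry_in=1):
    """8-bit a - b - 1 + carry_in: returns (result byte, carry out).

    Assumption: carry out is 1 when no borrow occurred, 0 otherwise.
    With carry_in=1 this is a plain subtract.
    """
    total = a - b - 1 + carry_in
    return total & 0xFF, 1 if total >= 0 else 0


def sub16_chained(x, y):
    """16-bit subtract built from an 8-bit sub and an 8-bit subc."""
    # Low bytes first (little endian), like: sub 0:p1, 2:p1, 4:p1
    lo, carry = sub8(x & 0xFF, y & 0xFF)
    # High bytes with the carry from the low bytes, like: subc 1:p1, 3:p1, 5:p1
    hi, _ = sub8((x >> 8) & 0xFF, (y >> 8) & 0xFF, carry)
    return (hi << 8) | lo


print(hex(sub16_chained(0x0100, 0x0001)))
```

When the low-byte subtract borrows, the carry is clear and the high byte is reduced by one extra, which is exactly the "- 1 + carry" term.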
neg (integer)


Operands: dst, src


Description:


dst = -src.

Constraints:

• All operands have domain mono or poly.
• All operands have type integer.
• All operands have equal width to

Side Effects:

• Updates the status register.

Notes: This can be chained together with negc to make a larger neg. For example,

neg 0:p2, 2:p2

is equivalent to

neg 0:p1, 2:p1

followed by

negc 1:p1, 3:p1

(little endian).

Details:

dst    src    cycles  latency  temporaries  comments
m2us   m2us   1       1                     hardware
m4us   m4us   2       2                     macro
p1us   p1us   1       2                     microcode
p1us   m2us   1       2                     microcode
p2us   p2us   2       3                     microcode
p2us   m2us   2       3                     microcode
p4us   p4us   4       5                     macro
p4us   m4us   4       5                     macro
neg (floating point 32 bit)


Operands: dst, src


Description:


dst = -src.

Constraints:

• All operands have type float.
• All operands have width of 4.

Side Effects:

• Leaves the status register in an undefined state.

Details:

dst    src    cycles  latency  temporaries  comments
m4f    m4f    2       2                     macro
p4f    p4f    4       5                     macro
neg (floating point 64 bit)


Operands: dst, src


Description:


dst = -src.

Constraints:

• All operands have type float.
• All operands have width of 8.

Side Effects:

• Leaves the status register in an undefined state.

Details:

dst    src    cycles  latency  temporaries  comments
m8f    m8f    4       4                     macro
p8f    p8f    6       7                     macro
negc (integer)


Operands: dst, src


Description:


dst = -src - 1 + carry.

Constraints:

• All operands have domain mono or poly.
• All operands have type integer.

Side Effects:

• Updates the status register.

Notes: This can be chained together with neg and negc to make a larger neg. For example,

neg 0:p2, 2:p2

is equivalent to

neg 0:p1, 2:p1

followed by

negc 1:p1, 3:p1

(little endian).

Details:

dst    src    cycles  latency  temporaries  comments
m2us   m2us   1       1                     hardware
m4us   m4us   2       2                     macro
p2us   p2us   2       3                     microcode
p2us   m2us   2       3                     microcode
p4us   p4us   4       5                     macro
p4us   m4us   4       5                     macro
mul (integer)


Operands: dst, src0, src1


Description:


dst = src0 * src1.

Constraints:

• All operands have equal width to
• All operands have type integer.
• dst has width of 2.
• src1 has not a domain of label.

Side Effects:

• Leaves the status register in an undefined state.

Notes: The macro versions of these instructions, denoted in the cycle table below with the
macro comment, do not allow the destination and sources to overlap. The hardware
and microcode instructions, however, allow the destination and sources to overlap.

Details:

dst    src0   src1   cycles  latency  temporaries  comments
m2u    m2u    m2u    3       3                     hardware
m2u    m2u    i2u    3       3                     hardware
m2s    m2s    m2s    3       3                     hardware
m2s    m2s    i2s    3       3                     hardware
p2u    p1u    p1u    4       5                     microcode
p2u    p1u    m2u    5       6        1 poly       macro
p2u    p1u    i1u    5       6        1 poly       macro
p2s    p1s    p1s    4       5                     microcode
p2s    p1s    m2s    5       6        1 poly       macro
p2s    p1s    i1s    5       6        1 poly       macro
mul (integer)


Operands: dst, src0, src1


Description:


dst = src0 * src1.

Constraints:

• All operands have type integer.
• dst cannot overlap with the other operands.
• dst has width equal to the combined width of the other operands.
• src1 has not a domain of label.

Side Effects:

• Leaves the status register in an undefined state.

Notes: The macro versions of these instructions, denoted in the cycle table below with the
macro comment, do not allow the destination and sources to overlap. The hardware
and microcode instructions, however, allow the destination and sources to overlap.

Details:

dst    src0   src1   cycles  latency  temporaries  comments
m4u    m2u    m2u    3       3                     hardware
m4u    m2u    i2u    3       3                     hardware
m8u    m4u    m4u    21      21       4 mono       macro
m8u    m4u    i4u    21      21       4 mono       macro
m4s    m2s    m2s    3       3                     hardware
m4s    m2s    i2s    3       3                     hardware
m8s    m4s    m4s    28      28       6 mono       macro
m8s    m4s    i4s    28      28       6 mono       macro
p2u    p1u    p1u    4       5                     microcode
p2u    p1u    m2u    5       6        1 poly       macro
p2u    p1u    i1u    5       6        1 poly       macro
p4u    p2u    p2u    4       5                     microcode
p4u    p2u    m2u    6       7        2 poly       macro
p4u    p2u    i2u    6       7        2 poly       macro
p8u    p4u    p4u    7       8                     microcode
p8u    p4u    m4u    11      12       4 poly       macro
p8u    p4u    i4u    11      12       4 poly       macro
p2s    p1s    p1s    4       5                     microcode
p2s    p1s    m2s    5       6        1 poly       macro
p2s    p1s    i1s    5       6        1 poly       macro
p4s    p2s    p2s    4       5                     microcode
p4s    p2s    m2s    6       7        2 poly       macro
p4s    p2s    i2s    6       7        2 poly       macro
p8s    p4s    p4s    7       8                     microcode
p8s    p4s    m4s    11      12       4 poly       macro
p8s    p4s    i4s    11      12       4 poly       macro
mul (floating point)


Operands: dst, src0, src1


Description:


dst = src0 * src1.

Constraints:

• All operands have type float.
• All operands have width of 4.
• src1 has not a domain of label.

Side Effects:

• Leaves the status register in an undefined state.

Notes: The mono version of this instruction will set the predicates. The negative and zero
flags will map onto the normal status flags. See the reference manual for more information on
how the other predicates are set. The poly version will set a separate floating point mul
status register. This register can be accessed by the instruction status.fpmul.get and also
by the instructions prefixed with if.fpmul and andif.fpmul. There is no way to restore the
poly floating point status, so it is advisable to only do poly floating point operations in a
single thread.

Details:

dst    src0   src1   cycles  latency  temporaries  comments
m4f    m4f    m4f    4       4                     hardware
m4f    m4f    i4f    8       8        4 mono       macro
p4f    p4f    p4f    4       5                     microcode
p4f    p4f    i4f    8       9        4 poly       macro
mul (floating point 64 bit)


Operands: dst, src0, src1


Description:


dst = src0 * src1.

Constraints:

• All operands have type float.
• All operands have width of 8.
• src1 has not a domain of label.

Notes: The mono version of this instruction will set the predicates. The negative and zero
flags will map onto the normal status flags. See the reference manual for more information on
how the other predicates are set. The poly version will set a separate floating point mul
status register. This register can be accessed by the instruction status.fpmul.get and also
by the instructions prefixed with if.fpmul and andif.fpmul. There is no way to restore the
poly floating point status, so it is advisable to only do poly floating point operations in a
single thread.

Details:

dst    src0   src1   cycles  latency  temporaries  comments
m8f    m8f    m8f    4       4                     hardware
m8f    m8f    i8f    10      10       8 mono       macro
p8f    p8f    p8f    4       5                     microcode
p8f    p8f    i8f    12      13       8 poly       macro

mul.lo (integer)

Operands: dst, src0, src1

Description:
dst = src0 * src1 (least significant bytes of).

Constraints:
• All operands have type integer.
• All operands have equal width to
• dst has width of 2.
• dst has width less than or equal to the combined width of the other operands.
• src1 does not have a domain of label.

Side Effects:
• Leaves the status register in an undefined state.
• Requires up to 1 level of enable stack.

Notes: Only the least significant width(dst) bytes of the result are copied into dst. The
macro versions of this instruction, marked macro in the comments column of the table
below, do not allow the destination to overlap the sources. The hardware and microcode
versions, however, allow the destination and sources to overlap.

Details:

dst   src0  src1  cycles  latency  temporaries    comments
m2u   m2u   m2u   3       3                       hardware
m2u   m2u   i2u   3       3                       hardware
m2s   m2s   m2s   3       3                       hardware
m2s   m2s   i2s   3       3                       hardware
p2u   p2u   p2u   4       5                       microcode
p2u   p2u   m2u   6       7        2 poly         macro
p2u   p2u   i2u   6       7        2 poly         macro

mul.lo (integer)

Operands: dst, src0, src1

Description:
dst = src0 * src1 (least significant bytes of).

Constraints:
• All operands have type integer.
• dst cannot overlap with the other operands.
• dst has width less than or equal to the combined width of the other operands.
• src1 does not have a domain of label.

Side Effects:
• Leaves the status register in an undefined state.
• Requires up to 1 level of enable stack.

Notes: Only the least significant width(dst) bytes of the result are copied into dst. The
macro versions of this instruction, marked macro in the comments column of the table
below, do not allow the destination to overlap the sources. The hardware and microcode
versions, however, allow the destination and sources to overlap.

Details:

dst   src0  src1  cycles  latency  temporaries    comments
m2u   m2u   m2u   3       3                       hardware
m2u   m2u   i2u   3       3                       hardware
m4u   m4u   m4u   11      11       2 mono         macro
m4u   m4u   i4u   11      11       2 mono         macro
m2s   m2s   m2s   3       3                       hardware
m2s   m2s   i2s   3       3                       hardware
m4s   m4s   m4s   11      11       2 mono         macro
m4s   m4s   i4s   11      11       2 mono         macro
p2u   p2u   p2u   4       5                       microcode
p2u   p2u   m2u   6       7        2 poly         macro
p2u   p2u   i2u   6       7        2 poly         macro
p4u   p4u   p4u   6       7                       microcode
p4u   p4u   m4u   10      11       4 poly         macro
p4u   p4u   i4u   10      11       4 poly         macro
p4s   p4s   p4s   6       7                       microcode
p4s   p4s   m4s   10      11       4 poly         macro
p4s   p4s   i4s   10      11       4 poly         macro

mul.hi (integer)

Operands: dst, src0, src1

Description:
dst = src0 * src1 (most significant bytes).

Constraints:
• All operands have type integer.
• dst cannot overlap with the other operands.
• dst has width less than or equal to the combined width of the other operands.
• src1 does not have a domain of label.

Side Effects:
• Leaves the status register in an undefined state.
• Requires up to 1 level of enable stack.

Notes: Only the most significant width(dst) bytes of the result are copied into dst. The
macro versions of this instruction, marked macro in the comments column of the table
below, do not allow the destination to overlap the sources. The hardware and microcode
versions, however, allow the destination and sources to overlap.

Details:

dst   src0  src1  cycles  latency  temporaries    comments
m2u   m2u   m2u   3       3                       hardware
m2u   m2u   i2u   3       3                       hardware
m4u   m4u   m4u   23      23       12 mono        macro
m4u   m4u   i4u   23      23       12 mono        macro
m2s   m2s   m2s   3       3                       hardware
m2s   m2s   i2s   3       3                       hardware
m4s   m4s   m4s   30      30       14 mono        macro
m4s   m4s   i4s   30      30       14 mono        macro
p4u   p4u   p4u   11      12       8 poly         macro
p4u   p4u   m4u   15      16       12 poly        macro
p4u   p4u   i4u   15      16       12 poly        macro
p4s   p4s   p4s   11      12       8 poly         macro
p4s   p4s   m4s   15      16       12 poly        macro
p4s   p4s   i4s   15      16       12 poly        macro
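
The relationship between mul.lo and mul.hi can be modelled in a few lines of Python. This is only a sketch of the arithmetic, not of the hardware: the helper name mul_parts is hypothetical, and it assumes that the full product is a two's-complement value as wide as both sources combined, which the constraints above imply but do not state.

```python
def mul_parts(src0, src1, src_width, dst_width):
    """Return (lo, hi): the least and most significant dst_width bytes of src0 * src1.

    Assumption (not stated in the manual): the full product is a two's-complement
    value 2 * src_width bytes wide. Negative Python ints model signed operands.
    """
    full_bits = 2 * src_width * 8
    dst_bits = dst_width * 8
    if dst_bits > full_bits:
        raise ValueError("dst is wider than the combined width of the sources")
    # Wrap the exact product to the full product width (two's complement).
    product = (src0 * src1) & ((1 << full_bits) - 1)
    mask = (1 << dst_bits) - 1
    lo = product & mask                                # what mul.lo would keep
    hi = (product >> (full_bits - dst_bits)) & mask    # what mul.hi would keep
    return lo, hi


lo, hi = mul_parts(0x1234, 0x5678, src_width=2, dst_width=2)
print(hex(lo), hex(hi))
```

For the 2-byte operands 0x1234 and 0x5678 the full product is 0x06260060, so under these assumptions mul.lo would yield 0x0060 and mul.hi would yield 0x0626.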

div (integer)

Operands: dst, src0, src1

Description:
dst = src0 / src1

Constraints:
• All operands have type integer.
• dst cannot overlap with the other operands.
• src1 does not have a domain of label.

Side Effects:
• Leaves the status register in an undefined state.
• Requires up to 1 level of enable stack.

Details:

dst   src0  src1  cycles  latency  temporaries    comments
m2u   m2u   m2u   23      23       8 mono         macro
m2u   m2u   i2u   19      19       8 mono         macro
m4u   m4u   m4u   40      40       14 mono        macro
m4u   m4u   i4u   34      34       14 mono        macro
m2s   m2s   m2s   44      44       10 mono        macro
m2s   m2s   i2s   31      31       10 mono        macro
m4s   m4s   m4s   69      69       16 mono        macro
m4s   m4s   i4s   50      50       16 mono        macro
p2u   p2u   p2u   27      28       2 mono 4 poly  macro
p2u   p2u   i2u   21      22       2 mono 4 poly  macro
p4u   p4u   p4u   92      93       8 poly         macro
p4u   p4u   i4u   70      71       4 poly         macro
p2s   p2s   p2s   57      58       2 mono 8 poly  macro
p2s   p2s   i2s   36      37       2 mono 8 poly  macro
p4s   p4s   p4s   138     139      12 poly        macro
p4s   p4s   i4s   93      94       8 poly         macro

div (floating point 32 bit)

Operands: dst, src0, src1

Description:
dst = src0 / src1

Constraints:
• All operands have type float.
• All operands have width of 4.
• dst cannot overlap with the other operands.
• src1 does not have a domain of label.

Side Effects:
• Leaves the status register in an undefined state.
• Requires up to 2 levels of enable stack.

Details:

dst   src0  src1  cycles  latency  temporaries    comments
p4f   p4f   p4f   60      61                      microcode
p4f   p4f   i4f   64      65       4 poly         macro

div (floating point 64 bit)

Operands: dst, src0, src1

Description:
dst = src0 / src1

Constraints:
• All operands have type float.
• All operands have width of 8.
• dst cannot overlap with the other operands.
• src1 does not have a domain of label.

Side Effects:
• Leaves the status register in an undefined state.
• Requires up to 2 levels of enable stack.

Notes: This uses the mutex instruction internally, surrounding the div.start and div.end
instructions.

Details:

dst   src0  src1  cycles  latency  temporaries    comments
p8f   p8f   p8f   143     144                     macro
p8f   p8f   i8f   151     152      8 poly         macro

3.1.3 Shift instructions


These instructions perform logical and arithmetic shift left and shift right operations. Macro 
shift instructions can be used by including shift.inc. 

lsl (unsigned)

Operands: dst, src0, src1

Description:
dst = src0 << src1 (unsigned left shift)

Constraints:
• dst has type unsigned.
• dst has equal width to src0.
• src0 has type unsigned.
• src1 has type unsigned.
• src1 does not have a domain of label.

Side Effects:
• Updates the status register.
• Requires up to 1 level of enable stack.

Notes: 0 is shifted into the least significant bit of dst as it is shifted left. It is more
efficient to shift by an immediate. For a poly dst, shifts of 1, 2 and 4 are more efficient,
with width of src0 less than 4. The enable stack is unaffected if an immediate shift is
used. A poly shift of 4 bytes by an immediate requires 4 bytes of poly temporaries and 4
extra cycles. The macro versions of this instruction, marked macro in the comments column
of the table below, do not allow the destination to overlap the sources. The hardware and
microcode versions, however, allow the destination and sources to overlap.

Details:

dst   src0  src1  cycles  latency  temporaries    comments
m2u   m2u   m2u   2       2                       hardware
m2u   m2u   1 u   2       2                       hardware
m2u   m2u   2 u   2       2                       hardware
m2u   m2u   3 u   3       3                       hardware
m2u   m2u   4 u   2       2                       hardware
m2u   m2u   5 u   3       3                       hardware
m2u   m2u   6 u   3       3                       hardware
m2u   m2u   7 u   3       3                       hardware
m2u   m2u   8 u   2       2                       hardware
m2u   m2u   9 u   3       3                       hardware
m2u   m2u   10 u  3       3                       hardware
m2u   m2u   11 u  3       3                       hardware
m2u   m2u   12 u  3       3                       hardware
m2u   m2u   13 u  3       3                       hardware
m2u   m2u   14 u  3       3                       hardware
m2u   m2u   15 u  2       2                       hardware
m4u   m4u   m4u   18      18       2 mono         macro
m4u   m4u   1 u   7       7        2 mono         macro
m4u   m4u   2 u   8       8        2 mono         macro
m4u   m4u   3 u   10      10       2 mono         macro
m4u   m4u   4 u   8       8        2 mono         macro
m4u   m4u   5 u   10      10       2 mono         macro
m4u   m4u   6 u   10      10       2 mono         macro
m4u   m4u   7 u   10      10       2 mono         macro
m4u   m4u   8 u   7       7        2 mono         macro
m4u   m4u   9 u   10      10       2 mono         macro
m4u   m4u   10 u  10      10       2 mono         macro
m4u   m4u   11 u  10      10       2 mono         macro
m4u   m4u   12 u  9       9        2 mono         macro
m4u   m4u   13 u  10      10       2 mono         macro
m4u   m4u   14 u  9       9        2 mono         macro
m4u   m4u   15 u  7       7        2 mono         macro
m4u   m4u   16 u  3       3                       macro
m4u   m4u   17 u  3       3                       macro
m4u   m4u   18 u  3       3                       macro
m4u   m4u   19 u  4       4                       macro
m4u   m4u   20 u  3       3                       macro
m4u   m4u   21 u  4       4                       macro
m4u   m4u   22 u  4       4                       macro
m4u   m4u   23 u  4       4                       macro
m4u   m4u   24 u  3       3                       macro
m4u   m4u   25 u  4       4                       macro
m4u   m4u   26 u  4       4                       macro
m4u   m4u   27 u  4       4                       macro
m4u   m4u   28 u  4       4                       macro
m4u   m4u   29 u  4       4                       macro
m4u   m4u   30 u  4       4                       macro
m4u   m4u   31 u  3       3                       macro
p1u   p1u   p1u   9       10                      microcode
p1u   p1u   m2u   10      11       1 poly         macro
p1u   p1u   1 u   1       2                       microcode
p1u   p1u   2 u   2       3                       microcode
p1u   p1u   3 u   3       4                       macro
p1u   p1u   4 u   2       3                       microcode
p1u   p1u   5 u   5       6                       macro
p1u   p1u   6 u   6       7                       macro
p1u   p1u   7 u   5       6                       macro
p2u   p2u   p2u   12      13                      microcode
p2u   p2u   m2u   14      15       2 poly         macro
p2u   p2u   1 u   2       3                       microcode
p2u   p2u   2 u   3       4                       microcode
p2u   p2u   3 u   5       6                       macro
p2u   p2u   4 u   3       4                       microcode
p2u   p2u   5 u   8       9                       macro
p2u   p2u   6 u   10      11                      macro
p2u   p2u   7 u   8       9                       macro
p2u   p2u   8 u   6       7                       macro
p2u   p2u   9 u   11      12                      macro
p2u   p2u   10 u  16      17                      macro
p2u   p2u   11 u  18      19                      macro
p2u   p2u   12 u  20      21                      macro
p2u   p2u   13 u  18      19                      macro
p2u   p2u   14 u  16      17                      macro
p2u   p2u   15 u  14      15                      macro
p4u   p4u   p4u   30      30                      microcode
p4u   p4u   m4u   34      35       4 poly         macro
p4u   p4u   1 u   4       5                       microcode
p4u   p4u   2 u   5       6                       microcode
p4u   p4u   3 u   16      17       4 poly         macro
p4u   p4u   4 u   5       6                       microcode
p4u   p4u   5 u   21      22       4 poly         macro
p4u   p4u   6 u   32      33       4 poly         macro
p4u   p4u   7 u   28      29       4 poly         macro
p4u   p4u   8 u   17      18       4 poly         macro
p4u   p4u   9 u   33      34       4 poly         macro
p4u   p4u   10 u  49      50       4 poly         macro
p4u   p4u   11 u  53      54       4 poly         macro
p4u   p4u   12 u  64      65       4 poly         macro
p4u   p4u   13 u  60      61       4 poly         macro
p4u   p4u   14 u  56      57       4 poly         macro
p4u   p4u   15 u  52      53       4 poly         macro
p4u   p4u   16 u  41      42       4 poly         macro
p4u   p4u   17 u  57      58       4 poly         macro
p4u   p4u   18 u  73      74       4 poly         macro
p4u   p4u   19 u  89      90       4 poly         macro
p4u   p4u   20 u  105     106      4 poly         macro
p4u   p4u   21 u  109     110      4 poly         macro
p4u   p4u   22 u  113     114      4 poly         macro
p4u   p4u   23 u  117     118      4 poly         macro
p4u   p4u   24 u  128     129      4 poly         macro
p4u   p4u   25 u  124     125      4 poly         macro
p4u   p4u   26 u  120     121      4 poly         macro
p4u   p4u   27 u  116     117      4 poly         macro
p4u   p4u   28 u  112     113      4 poly         macro
p4u   p4u   29 u  108     109      4 poly         macro
p4u   p4u   30 u  104     105      4 poly         macro
p4u   p4u   31 u  100     101      4 poly         macro
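
The cycles column can be used to check when an immediate shift actually pays off. The sketch below copies the lsl cycle counts for a p2u destination from the table above and compares them with the 12 cycles listed for a poly register shift amount. The function name is hypothetical, and the comparison deliberately ignores latency, temporaries and the cost of getting the shift amount into a register.

```python
# Cycle counts copied from the lsl Details table, rows "p2u p2u <n> u".
LSL_P2U_IMMEDIATE_CYCLES = {
    1: 2, 2: 3, 3: 5, 4: 3, 5: 8, 6: 10, 7: 8, 8: 6,
    9: 11, 10: 16, 11: 18, 12: 20, 13: 18, 14: 16, 15: 14,
}
# Cycle count of the row "p2u p2u p2u" (shift amount held in a poly register).
LSL_P2U_REGISTER_CYCLES = 12


def cheaper_form(shift):
    """Return which operand form has the lower cycle count for this shift amount."""
    immediate = LSL_P2U_IMMEDIATE_CYCLES[shift]
    return "immediate" if immediate < LSL_P2U_REGISTER_CYCLES else "register"


for amount in (1, 4, 8, 12):
    print(amount, cheaper_form(amount))
```

Going by the cycles column alone, shifts of 1 to 9 are cheaper as immediates for a p2u destination, while shifts of 10 to 15 cost more cycles than the register form.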

lsr (unsigned)

Operands: dst, src0, src1

Description:
dst = src0 >> src1 (unsigned right shift)

Constraints:
• dst has type unsigned.
• dst has equal width to src0.
• src0 has type unsigned.
• src1 has type unsigned.
• src1 does not have a domain of label.

Side Effects:
• Updates the status register.

Notes: 0 is shifted into the most significant bit of dst as it is shifted right. It is more
efficient to shift by an immediate. For a poly dst, shifts of 1, 2 and 4 are more efficient,
with width of src0 less than 4. The enable stack is unaffected if an immediate shift is
used. A poly shift of 4 bytes by an immediate requires 4 bytes of poly temporaries and 4
extra cycles. The macro versions of this instruction, marked macro in the comments column
of the table below, do not allow the destination to overlap the sources. The hardware and
microcode versions, however, allow the destination and sources to overlap.

Details:

dst   src0  src1  cycles  latency  temporaries    comments
m2u   m2u   m2u   2       2                       hardware
m2u   m2u   1 u   2       2                       hardware
m2u   m2u   2 u   2       2                       hardware
m2u   m2u   3 u   3       3                       hardware
m2u   m2u   4 u   2       2                       hardware
m2u   m2u   5 u   3       3                       hardware
m2u   m2u   6 u   3       3                       hardware
m2u   m2u   7 u   3       3                       hardware
m2u   m2u   8 u   2       2                       hardware
m2u   m2u   9 u   3       3                       hardware
m2u   m2u   10 u  3       3                       hardware
m2u   m2u   11 u  3       3                       hardware
m2u   m2u   12 u  3       3                       hardware
m2u   m2u   13 u  3       3                       hardware
m2u   m2u   14 u  3       3                       hardware
m2u   m2u   15 u  2       2                       hardware
m4u   m4u   m4u   18      18       2 mono         macro
m4u   m4u   1 u   7       7        2 mono         macro
m4u   m4u   2 u   8       8        2 mono         macro
m4u   m4u   3 u   10      10       2 mono         macro
m4u   m4u   4 u   8       8        2 mono         macro
m4u   m4u   5 u   10      10       2 mono         macro
m4u   m4u   6 u   10      10       2 mono         macro
m4u   m4u   7 u   10      10       2 mono         macro
m4u   m4u   8 u   7       7        2 mono         macro
m4u   m4u   9 u   10      10       2 mono         macro
m4u   m4u   10 u  10      10       2 mono         macro
m4u   m4u   11 u  10      10       2 mono         macro
m4u   m4u   12 u  9       9        2 mono         macro
m4u   m4u   13 u  10      10       2 mono         macro
m4u   m4u   14 u  9       9        2 mono         macro
m4u   m4u   15 u  7       7        2 mono         macro
m4u   m4u   16 u  7       7        2 mono         macro
m4u   m4u   17 u  3       3                       macro
m4u   m4u   18 u  3       3                       macro
m4u   m4u   19 u  4       4                       macro
m4u   m4u   20 u  3       3                       macro
m4u   m4u   21 u  4       4                       macro
m4u   m4u   22 u  4       4                       macro
m4u   m4u   23 u  4       4                       macro
m4u   m4u   24 u  3       3                       macro
m4u   m4u   25 u  4       4                       macro
m4u   m4u   26 u  4       4                       macro
m4u   m4u   27 u  4       4                       macro
m4u   m4u   28 u  4       4                       macro
m4u   m4u   29 u  4       4                       macro
m4u   m4u   30 u  4       4                       macro
m4u   m4u   31 u  3       3                       macro
p1u   p1u   p1u   9       10                      microcode
p1u   p1u   m2u   10      11       1 poly         macro
p1u   p1u   1 u   2       3                       microcode
p1u   p1u   2 u   2       3                       microcode
p1u   p1u   3 u   4       5                       macro
p1u   p1u   4 u   2       3                       microcode
p1u   p1u   5 u   6       7                       macro
p1u   p1u   6 u   8       9                       macro
p1u   p1u   7 u   6       7                       macro
p2u   p2u   p2u   12      13                      microcode
p2u   p2u   m2u   14      15       2 poly         macro
p2u   p2u   1 u   3       4                       microcode
p2u   p2u   2 u   3       4                       microcode
p2u   p2u   3 u   6       7                       macro
p2u   p2u   4 u   3       4                       microcode
p2u   p2u   5 u   9       10                      macro
p2u   p2u   6 u   12      13                      macro
p2u   p2u   7 u   9       10                      macro
p2u   p2u   8 u   6       7                       macro
p2u   p2u   9 u   12      13                      macro
p2u   p2u   10 u  18      19                      macro
p2u   p2u   11 u  21      22                      macro
p2u   p2u   12 u  24      25                      macro
p2u   p2u   13 u  21      22                      macro
p2u   p2u   14 u  18      19                      macro
p2u   p2u   15 u  15      16                      macro
p4u   p4u   p4u   29      29                      microcode
p4u   p4u   m4u   33      34       4 poly         macro
p4u   p4u   1 u   5       6                       microcode
p4u   p4u   2 u   5       6                       microcode
p4u   p4u   3 u   17      18       4 poly         macro
p4u   p4u   4 u   5       6                       microcode
p4u   p4u   5 u   29      30       4 poly         macro
p4u   p4u   6 u   41      42       4 poly         macro
p4u   p4u   7 u   29      30       4 poly         macro
p4u   p4u   8 u   17      18       4 poly         macro
p4u   p4u   9 u   41      42       4 poly         macro
p4u   p4u   10 u  65      66       4 poly         macro
p4u   p4u   11 u  77      78       4 poly         macro
p4u   p4u   12 u  89      90       4 poly         macro
p4u   p4u   13 u  77      78       4 poly         macro
p4u   p4u   14 u  65      66       4 poly         macro
p4u   p4u   15 u  53      54       4 poly         macro
p4u   p4u   16 u  41      42       4 poly         macro
p4u   p4u   17 u  65      66       4 poly         macro
p4u   p4u   18 u  89      90       4 poly         macro
p4u   p4u   19 u  113     114      4 poly         macro
p4u   p4u   20 u  137     138      4 poly         macro
p4u   p4u   21 u  149     150      4 poly         macro
p4u   p4u   22 u  161     162      4 poly         macro
p4u   p4u   23 u  173     174      4 poly         macro
p4u   p4u   24 u  185     186      4 poly         macro
p4u   p4u   25 u  173     174      4 poly         macro
p4u   p4u   26 u  161     162      4 poly         macro
p4u   p4u   27 u  149     150      4 poly         macro
p4u   p4u   28 u  137     138      4 poly         macro
p4u   p4u   29 u  125     126      4 poly         macro
p4u   p4u   30 u  113     114      4 poly         macro
p4u   p4u   31 u  101     102      4 poly         macro

asr

Operands: dst, src0, src1

Description:
dst = src0 >> src1 (arithmetic right shift)

Constraints:
• dst has type signed.
• dst has equal width to src0.
• src0 has type signed.
• src1 has type unsigned.
• src1 does not have a domain of label.

Side Effects:
• Updates the status register.

Notes: 1 is shifted into the most significant bit if the msb of src0 is set to 1, otherwise 0.
It is more efficient to shift by an immediate. For a poly dst, shifts of 1, 2 and 4 are more
efficient, with width of src0 less than 4. The enable stack is unaffected if an immediate
shift is used. This instruction does not allow the destination to overlap the sources.

Details:

dst   src0  src1  cycles  latency  temporaries    comments
m2s   m2s   m2u   2       2                       hardware
m2s   m2s   1 u   2       2                       hardware
m2s   m2s   2 u   2       2                       hardware
m2s   m2s   3 u   3       3                       hardware
m2s   m2s   4 u   2       2                       hardware
m2s   m2s   5 u   3       3                       hardware
m2s   m2s   6 u   3       3                       hardware
m2s   m2s   7 u   3       3                       hardware
m2s   m2s   8 u   2       2                       hardware
m2s   m2s   9 u   3       3                       hardware
m2s   m2s   10 u  3       3                       hardware
m2s   m2s   11 u  3       3                       hardware
m2s   m2s   12 u  3       3                       hardware
m2s   m2s   13 u  3       3                       hardware
m2s   m2s   14 u  3       3                       hardware
m2s   m2s   15 u  2       2                       hardware
m4s   m4s   m4u   19      19       2 mono         macro
m4s   m4s   1 u   7       7        2 mono         macro
m4s   m4s   2 u   8       8        2 mono         macro
m4s   m4s   3 u   10      10       2 mono         macro
m4s   m4s   4 u   8       8        2 mono         macro
m4s   m4s   5 u   10      10       2 mono         macro
m4s   m4s   6 u   10      10       2 mono         macro
m4s   m4s   7 u   10      10       2 mono         macro
m4s   m4s   8 u   7       7        2 mono         macro
m4s   m4s   9 u   10      10       2 mono         macro
m4s   m4s   10 u  10      10       2 mono         macro
m4s   m4s   11 u  10      10       2 mono         macro
m4s   m4s   12 u  9       9        2 mono         macro
m4s   m4s   13 u  10      10       2 mono         macro
m4s   m4s   14 u  9       9        2 mono         macro
m4s   m4s   15 u  7       7        2 mono         macro
m4s   m4s   16 u  7       7        2 mono         macro
m4s   m4s   17 u  4       4                       macro
m4s   m4s   18 u  4       4                       macro
m4s   m4s   19 u  5       5                       macro
m4s   m4s   20 u  4       4                       macro
m4s   m4s   21 u  5       5                       macro
m4s   m4s   22 u  5       5                       macro
m4s   m4s   23 u  5       5                       macro
m4s   m4s   24 u  4       4                       macro
m4s   m4s   25 u  5       5                       macro
m4s   m4s   26 u  5       5                       macro
m4s   m4s   27 u  5       5                       macro
m4s   m4s   28 u  5       5                       macro
m4s   m4s   29 u  5       5                       macro
m4s   m4s   30 u  5       5                       macro
m4s   m4s   31 u  4       4                       macro
p1s   p1s   p1u   9       10                      microcode
p1s   p1s   m2u   10      11       1 poly         macro
p1s   p1s   1 u   3       4                       microcode
p1s   p1s   2 u   4       5                       microcode
p1s   p1s   3 u   7       8                       macro
p1s   p1s   4 u   4       5                       microcode
p1s   p1s   5 u   11      12                      macro
p1s   p1s   6 u   14      15                      macro
p1s   p1s   7 u   11      12                      macro
p2s   p2s   p2u   12      13                      microcode
p2s   p2s   m2u   14      15       2 poly         macro
p2s   p2s   1 u   4       5                       microcode
p2s   p2s   2 u   5       6                       microcode
p2s   p2s   3 u   9       10                      macro
p2s   p2s   4 u   5       6                       microcode
p2s   p2s   5 u   14      15                      macro
p2s   p2s   6 u   18      19                      macro
p2s   p2s   7 u   14      15                      macro
p2s   p2s   8 u   10      11                      macro
p2s   p2s   9 u   19      20                      macro
p2s   p2s   10 u  28      29                      macro
p2s   p2s   11 u  32      33                      macro
p2s   p2s   12 u  36      37                      macro
p2s   p2s   13 u  32      33                      macro
p2s   p2s   14 u  28      29                      macro
p2s   p2s   15 u  24      25                      macro
p4s   p4s   p4u   32      33                      microcode
p4s   p4s   m4u   36      37       4 poly         macro
p4s   p4s   1 u   7       8                       microcode
p4s   p4s   2 u   8       9                       microcode
p4s   p4s   3 u   15      16                      macro
p4s   p4s   4 u   8       9                       microcode
p4s   p4s   5 u   23      24                      macro
p4s   p4s   6 u   30      31                      macro
p4s   p4s   7 u   23      24                      macro
p4s   p4s   8 u   16      17                      macro
p4s   p4s   9 u   31      32                      macro
p4s   p4s   10 u  46      47                      macro
p4s   p4s   11 u  53      54                      macro
p4s   p4s   12 u  60      61                      macro
p4s   p4s   13 u  53      54                      macro
p4s   p4s   14 u  46      47                      macro
p4s   p4s   15 u  39      40                      macro
p4s   p4s   16 u  32      33                      macro
p4s   p4s   17 u  47      48                      macro
p4s   p4s   18 u  62      63                      macro
p4s   p4s   19 u  77      78                      macro
p4s   p4s   20 u  92      93                      macro
p4s   p4s   21 u  99      100                     macro
p4s   p4s   22 u  106     107                     macro
p4s   p4s   23 u  113     114                     macro
p4s   p4s   24 u  120     121                     macro
p4s   p4s   25 u  113     114                     macro
p4s   p4s   26 u  106     107                     macro
p4s   p4s   27 u  99      100                     macro
p4s   p4s   28 u  92      93                      macro
p4s   p4s   29 u  85      86                      macro
p4s   p4s   30 u  78      79                      macro
p4s   p4s   31 u  71      72                      macro
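
The difference between asr and lsr is only in what enters at the most significant bit. The following Python sketch models both on fixed-width values; the function names mirror the mnemonics but are illustrative helpers, not part of any ClearSpeed tool, and the model covers the result value only (no status flags, cycles or temporaries).

```python
def lsr(value, shift, width):
    """Logical right shift on a width-byte value: zeros enter at the msb."""
    bits = width * 8
    return (value & ((1 << bits) - 1)) >> shift


def asr(value, shift, width):
    """Arithmetic right shift on a width-byte value: the sign bit is replicated."""
    bits = width * 8
    mask = (1 << bits) - 1
    value &= mask
    if value >> (bits - 1):       # msb set, so the value is negative
        value -= 1 << bits        # reinterpret as a negative Python int
    # Python's >> on a negative int is already arithmetic; wrap back to width.
    return (value >> shift) & mask


print(hex(lsr(0x8000, 1, 2)), hex(asr(0x8000, 1, 2)))
```

With a 2-byte operand of 0x8000 shifted by 1, the logical shift gives 0x4000 while the arithmetic shift gives 0xc000, because the set msb is copied into the vacated bit.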

lslc (unsigned)

Operands: dst, src0, src1

Description:
dst = src0 << src1 with carry (unsigned left shift)

Constraints:
• dst has type unsigned.
• dst has equal width to src0.
• src0 has type unsigned.
• src1 has type unsigned.
• src1 does not have a domain of label.

Side Effects:
• Updates the status register.
• Requires up to 1 level of enable stack.

Notes: The carry is shifted into the least significant bit of dst as it is shifted left. It is
more efficient to shift by an immediate.

Details:

dst   src0  src1  cycles  latency  temporaries    comments
m2u   m2u   m2u                                   hardware
m2u   m2u   i2u                                   hardware
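
A typical use of a shift with carry is to build a wider shift out of narrower ones. The Python sketch below models a 1-bit left shift of a 32-bit value as a 16-bit lsl on the low half followed by a 16-bit lslc on the high half. It assumes that the status register's carry holds the bit shifted out of the low half; the manual only says that the status register is updated, so treat this as an illustration rather than a statement of the hardware's behaviour. All function names are hypothetical.

```python
MASK16 = 0xFFFF


def lsl16(value):
    """Shift a 16-bit value left by one bit; return (result, carry_out)."""
    return (value << 1) & MASK16, (value >> 15) & 1


def lslc16(value, carry_in):
    """Shift left by one bit, placing carry_in in the least significant bit."""
    return ((value << 1) & MASK16) | carry_in, (value >> 15) & 1


def shift_left_32(value):
    """Shift a 32-bit value left by one bit using two 16-bit steps."""
    lo, carry = lsl16(value & MASK16)               # low half first
    hi, _ = lslc16((value >> 16) & MASK16, carry)   # carry enters the high half
    return (hi << 16) | lo


print(hex(shift_left_32(0x00008001)))
```

Here the msb of the low half (0x8001) moves into the lsb of the high half, giving 0x10002.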

lsrc (unsigned)

Operands: dst, src0, src1

Description:
dst = src0 >> src1 with carry (unsigned right shift)

Constraints:
• dst has type unsigned.
• dst has equal width to src0.
• src0 has type unsigned.
• src1 has type unsigned.
• src1 does not have a domain of label.

Side Effects:
• Updates the status register.

Notes: The carry is shifted into the most significant bit of dst as it is shifted right. It is
more efficient to shift by an immediate.

Details:

dst   src0  src1  cycles  latency  temporaries    comments
m2u   m2u   m2u                                   hardware
m2u   m2u   i2u                                   hardware

3.1.4 Bitwise instructions


These instructions perform and, not, or and xor bitwise operations. Macro boolean 
instructions can be used by including bool.inc. 

not

Operands: dst, src

Description:
dst = ~src (bitwise inverse)

Constraints:
• All operands have type integer.

Side Effects:
• Updates the status register.

Notes: This can be chained together with not.extend if the combined zero status is
important.

Details:

dst   src    cycles  latency  temporaries    comments
m2us  m2us   1       1                       hardware
m2us  il2us  2       2                       hardware
m4us  m4us   2       2                       macro
m4us  il4us  4       4                       macro
p1us  p1us   1       2                       microcode
p1us  m2us   1       2                       microcode
p1us  il1us  1       2                       microcode
p2us  p2us   2       3                       microcode
p2us  m2us   2       3                       microcode
p2us  il2us  2       3                       microcode
p4us  p4us   4       5                       macro
p4us  m4us   4       5                       macro

not.extend

Operands: dst, src

Description:
dst = ~src (bitwise inverse)

Constraints:
• All operands have type integer.

Side Effects:
• Updates the status register.

Notes: This is used to extend the status of a not or not.extend instruction. The zero flag
is the bitwise and of the previous zero flag and the zero flag produced by this instruction.

Details:

dst   src    cycles  latency  temporaries    comments
m2us  m2us   1       1                       hardware
m2us  il2us  2       2                       hardware
p2us  p2us   2       3                       microcode
p2us  m2us   2       3                       microcode
p2us  il2us  2       3                       microcode

and

Operands: dst, src0, src1

Description:
dst = src0 & src1 (bitwise and)

Constraints:
• All operands have type integer.

Side Effects:
• Updates the status register.

Notes: This can be chained together with and.extend if the combined zero status is
important.

Details:

dst   src0  src1   cycles  latency  temporaries    comments
m2us  m2us  m2us   1       1                       hardware
m2us  m2us  il2us  2       2                       hardware
m4us  m4us  m4us   2       2                       macro
m4us  m4us  il4us  4       4                       macro
p1us  p1us  p1us   1       2                       microcode
p1us  p1us  m2us   1       2                       microcode
p1us  p1us  il1us  1       2                       microcode
p2us  p2us  p2us   2       3                       microcode
p2us  p2us  m2us   2       3                       microcode
p2us  p2us  il2us  2       3                       microcode
p4us  p4us  p4us   4       5                       microcode
p4us  p4us  m4us   4       5                       macro
p4us  p4us  il4us  4       5                       macro

and.extend

Operands: dst, src0, src1

Description:
dst = src0 & src1 (bitwise and)

Constraints:
• All operands have type integer.

Side Effects:
• Updates the status register.

Notes: This is used to extend the status of an and or and.extend instruction. The zero
flag is the bitwise and of the previous zero flag and the zero flag produced by this
instruction.

Details:

dst   src0  src1   cycles  latency  temporaries    comments
m2us  m2us  m2us   1       1                       hardware
m2us  m2us  il2us  2       2                       hardware
p2us  p2us  m2us   2       3                       microcode
p2us  p2us  il2us  2       3                       microcode
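
The chaining rule for the .extend forms can be illustrated with a small model. The Python sketch below treats a 32-bit and as a 16-bit and on the low halves followed by a 16-bit and.extend on the high halves, and tracks only the zero flag. The helper names are hypothetical, and the reading that the "previous status flag" means the previous zero flag is an interpretation of the note above.

```python
MASK16 = 0xFFFF


def and16(a, b):
    """Model of and: return (result, zero_flag) for 16-bit operands."""
    result = a & b & MASK16
    return result, result == 0


def and_extend16(a, b, prev_zero):
    """Model of and.extend: zero is set only if it was set before and this result is 0."""
    result = a & b & MASK16
    return result, prev_zero and result == 0


def and32(a, b):
    """32-bit and built from two 16-bit steps; returns (result, combined_zero_flag)."""
    lo, zero = and16(a & MASK16, b & MASK16)
    hi, zero = and_extend16(a >> 16, b >> 16, zero)
    return (hi << 16) | lo, zero


print(and32(0x00010000, 0x0001FFFF))
```

In this example the low halves and to zero but the high halves do not, so the combined zero flag ends up clear even though the first step set it.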

or

Operands: dst, src0, src1

Description:
dst = src0 | src1 (bitwise or)

Constraints:
• All operands have type integer.
• All operands have equal width to

Side Effects:
• Updates the status register.

Notes: This can be chained together with or.extend if the combined zero status is
important.

Details:

dst   src0  src1   cycles  latency  temporaries    comments
m2us  m2us  m2us   1       1                       hardware
m2us  m2us  il2us  2       2                       hardware
m4us  m4us  m4us   2       2                       macro
m4us  m4us  il4us  4       4                       macro
p1us  p1us  p1us   1       2                       microcode
p1us  p1us  m2us   1       2                       microcode
p1us  p1us  il1us  1       2                       microcode
p2us  p2us  p2us   2       3                       microcode
p2us  p2us  m2us   2       3                       microcode
p2us  p2us  il2us  2       3                       microcode
p4us  p4us  p4us   4       5                       microcode
p4us  p4us  m4us   4       5                       macro
p4us  p4us  il4us  4       5                       macro

or.extend

Operands: dst, src0, src1

Description:
dst = src0 | src1 (bitwise or)

Constraints:
• All operands have type integer.

Side Effects:
• Updates the status register.

Notes: This is used to extend the status of an or instruction or an or.extend instruction.
The zero flag is the bitwise and of the previous zero flag and the zero flag produced by
this instruction.

Details:

dst   src0  src1   cycles  latency  temporaries    comments
m2us  m2us  m2us   1       1                       hardware
m2us  m2us  il2us  2       2                       hardware
p2us  p2us  m2us   2       3                       microcode
p2us  p2us  il2us  2       3                       microcode

xor

Operands: dst, src0, src1

Description:
dst = src0 ^ src1 (bitwise xor)

Constraints:
• All operands have type integer.
• All operands have equal width to

Side Effects:
• Updates the status register.

Notes: This can be chained together with xor.extend if the combined zero status is
important.

Details:

dst   src0  src1   cycles  latency  temporaries    comments
m2us  m2us  m2us   1       1                       hardware
m2us  m2us  il2us  2       2                       hardware
m4us  m4us  m4us   2       2                       macro
m4us  m4us  il4us  4       4                       macro
p1us  p1us  p1us   1       2                       microcode
p1us  p1us  m2us   1       2                       microcode
p1us  p1us  il1us  1       2                       microcode
p2us  p2us  p2us   2       3                       microcode
p2us  p2us  m2us   2       3                       microcode
p2us  p2us  il2us  2       3                       microcode
p4us  p4us  p4us   4       5                       microcode
p4us  p4us  m4us   4       5                       macro
p4us  p4us  il4us  4       5                       macro

xor.extend

Operands: dst, src0, src1

Description:
dst = src0 ^ src1 (bitwise xor)

Constraints:
• All operands have type integer.

Side Effects:
• Updates the status register.

Notes: This is used to extend the status of an xor instruction or an xor.extend
instruction. The zero flag is the bitwise and of the previous zero flag and the zero flag
produced by this instruction.

Details:

dst   src0  src1   cycles  latency  temporaries    comments
m2us  m2us  m2us   1       1                       hardware
m2us  m2us  il2us  2       2                       hardware
p2us  p2us  m2us   2       3                       microcode
p2us  p2us  il2us  2       3                       microcode

3.1.5 Compare instructions


These instructions perform comparisons of 2 numbers. They either set the status register or 
set a destination to be non-zero if the comparison is true. Macro compare instructions can 
be used by including cmp.inc. 

cmp (floating point 32 bit)

Operands: src0, src1

Description:
Sets the floating point add status register as if src1 was subtracted from src0.

Constraints:
• src0 has type float.
• src0 has width of 4.
• src1 has type float.
• src1 has width of 4.

Notes: The mono version of this instruction sets the predicates. The negative and zero
flags map onto the normal status flags. See the reference manual for more information on
how the other predicates are set. The poly version sets a separate floating point add
status register. This register can be accessed with the instruction status.fpadd.get and
with instructions prefixed with if.fpadd and andif.fpadd. There is no way to restore the
poly floating point status, so it is advisable to perform poly floating point operations
in a single thread only.

Details:

src0  src1   cycles  latency  temporaries    comments
m4f   m4f    4       4                       hardware
m4f   il4f   9       9        4 mono         macro
p4f   p4f    4       4                       microcode
p4f   m4f    8       9        4 poly         macro
p4f   il4f   8       9        4 poly         macro

cmp (floating point 64 bit)

Operands: src0, src1

Description:
Sets the floating point add status register as if src1 was subtracted from src0.

Constraints:
• src0 has type float.
• src0 has width of 8.
• src1 has type float.
• src1 has width of 8.

Notes: The mono version of this instruction sets the predicates. The negative and zero
flags map onto the normal status flags. See the reference manual for more information on
how the other predicates are set. The poly version sets a separate floating point add
status register. This register can be accessed with the instruction status.fpadd.get and
with instructions prefixed with if.fpadd and andif.fpadd. There is no way to restore the
poly floating point status, so it is advisable to perform poly floating point operations
in a single thread only.

Details:

src0  src1   cycles  latency  temporaries    comments
m8f   m8f    4       4                       hardware
m8f   il8f   13      13       8 mono         macro
p8f   p8f    4       4                       microcode
p8f   m8f    12      13       8 poly         macro
p8f   il8f   12      13       8 poly         macro

cmp (unsigned)

Operands: src0, src1

Description:
Sets the status register as if src1 was subtracted from src0.

Constraints:
• src0 has type unsigned.
• src1 has type unsigned.

Side Effects:
• Updates the status register.

Details:

src0  src1   cycles  latency  temporaries    comments
m2u   m2u    1       1                       hardware
m2u   il2u   2       2                       hardware
m4u   m4u    2       2                       macro
m4u   il4u   4       4                       macro
p1u   p1u    1       1                       microcode
p1u   m2u    1       1                       microcode
p1u   il1u   1       1                       microcode
p2u   p2u    2       2                       microcode
p2u   m2u    2       2                       microcode
p2u   il2u   2       2                       microcode
p4u   p4u    4       4                       microcode
p4u   m4u    4       4                       macro
p4u   il4u   4       4                       macro

cmp (signed)

Operands: src0, src1

Description:
Sets the status register as if src1 was subtracted from src0.

Constraints:
• src0 has type signed.
• src1 has type signed.

Side Effects:
• Updates the status register.
• Requires up to 1 level of enable stack.

Details:

src0  src1   cycles  latency  temporaries    comments
m2s   m2s    1       1                       hardware
m2s   il2s   2       2                       hardware
m4s   m4s    2       2                       macro
m4s   il4s   4       4                       macro
p1s   p1s    1       1                       microcode
p1s   m2s    1       1                       microcode
p1s   il1s   1       1                       microcode
p2s   p2s    2       2                       microcode
p2s   m2s    2       2                       microcode
p2s   il2s   2       2                       microcode
p4s   p4s    4       4                       microcode
p4s   m4s    4       4                       macro
p4s   il4s   4       4                       macro

cmpc (unsigned)

Operands: src0, src1

Description:
Sets the status register as if src1 was subtracted from src0. The previous status is then
compared to this, and only flags that would have been set by this operation and are
present in the old status are set.

Constraints:
• src0 has type unsigned.
• src1 has type unsigned.

Side Effects:
• Updates the status register.

Details:

src0  src1   cycles  latency  temporaries    comments
m2u   m2u    1       1                       hardware
m2u   il2u   2       2                       hardware
p1u   p1u    1       1                       microcode
p1u   m2u    1       1                       microcode
p1u   il1u   1       1                       microcode
p2u   m2u    2       2                       microcode
p2u   il2u   2       2                       microcode

cmpc (signed)

Operands: src0, src1

Description:
Sets the status register as if src1 was subtracted from src0. The previous status is then
compared to this, and only flags that would have been set by this operation and are
present in the old status are set.

Constraints:
• src0 has type signed.
• src1 has type signed.

Side Effects:
• Updates the status register.
• Requires up to 1 level of enable stack.

Details:

src0  src1   cycles  latency  temporaries    comments
m2s   m2s    1       1                       hardware
m2s   il2s   2       2                       hardware
p1s   p1s    1       1                       microcode
p1s   m2s    1       1                       microcode
p1s   il1s   1       1                       microcode
p2s   m2s    2       2                       microcode
p2s   il2s   2       2                       microcode
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CSX600 Instruction Set 


cmp.lt (floating point 32 bit)

Operands: dst, src0, src1

Description:
If (src0 < src1), dst = non-zero value, otherwise dst = 0.

Constraints:
• dst has type unsigned.
• src0 has type float.
• src0 has width of 4.
• src1 has type float.
• src1 has width of 4.

Side Effects:
• Leaves the status register in an undefined state.
• Requires up to 1 level of enable stack.

Details:
dst   src0  src1  cycles  latency  temporaries | comments
m2u   m4f   m4f   6       6                    | macro
p1u   p4f   p4f   8       9                    | macro
p1u   p4f   m4f   12      13       4 poly      | macro
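
The cmp.cc family shares one pattern: evaluate the condition and write a non-zero value or 0 to dst. The Python sketch below models that pattern for a mono result and for a poly result with one value per processing element. The function names are hypothetical, and the use of 1 as the non-zero value is an arbitrary choice, since the description only promises "non-zero".

```python
import operator

# Condition codes used by the cmp.cc mnemonics.
CONDITIONS = {
    "lt": operator.lt, "le": operator.le, "eq": operator.eq,
    "ne": operator.ne, "gt": operator.gt, "ge": operator.ge,
}


def cmp_cc(cc, src0, src1):
    """Mono model: non-zero (here 1, an assumption) if the condition holds."""
    return 1 if CONDITIONS[cc](src0, src1) else 0


def cmp_cc_poly(cc, src0_per_pe, src1):
    """Poly model: src1 is either a per-PE list or a scalar that is broadcast."""
    if not isinstance(src1, list):
        src1 = [src1] * len(src0_per_pe)
    return [cmp_cc(cc, a, b) for a, b in zip(src0_per_pe, src1)]


print(cmp_cc("lt", 1, 2))
print(cmp_cc_poly("lt", [1, 5, 3], 3))
```

Code that consumes dst should test it against 0 rather than against 1, because the exact non-zero value is not specified.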
cmp.lt (floating point 64 bit)

Operands: dst, src0, src1

Description:
If (src0 < src1), dst = non-zero value, otherwise dst = 0.

Constraints:
• dst has type unsigned.
• src0 has type float.
• src0 has width of 8.
• src1 has type float.
• src1 has width of 8.

Side Effects:
• Leaves the status register in an undefined state.
• Requires up to 1 level of enable stack.

Details:
dst   src0  src1  cycles  latency  temporaries | comments
m2u   m8f   m8f   6       6                    | macro
p1u   p8f   p8f   8       9                    | macro
p1u   p8f   m8f   16      17       8 poly      | macro
cmp.lt (unsigned)

Operands: dst, src0, src1

Description:
If (src0 < src1), dst = non-zero value, otherwise dst = 0.

Constraints:
• dst has type unsigned.
• src0 has type unsigned.
• src1 has type unsigned.

Side Effects:
• Leaves the status register in an undefined state.
• Requires up to 1 level of enable stack.

Details:
dst   src0  src1  cycles  latency  temporaries | comments
m2u   m2u   m2u   4       4                    | macro
m2u   m2u   i2u   5       5                    | macro
p1u   p1u   p1u   5       6                    | macro
p1u   p1u   m2u   5       6                    | macro
p1u   p1u   i1u   5       6                    | macro
cmp.lt (signed)

Operands: dst, src0, src1

Description:
If (src0 < src1), dst = non-zero value, otherwise dst = 0.

Constraints:
• dst has type unsigned.
• src0 has type signed.
• src1 has type signed.

Side Effects:
• Leaves the status register in an undefined state.
• Requires up to 1 level of enable stack.

Details:
dst   src0  src1  cycles  latency  temporaries | comments
m2u   m2s   m2s   3       3                    | macro
m2u   m2s   i2s   4       4                    | macro
p1u   p1s   p1s   5       6                    | macro
p1u   p1s   m2s   5       6                    | macro
p1u   p1s   i1s   5       6                    | macro
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cmp.le (floating point 32 bit) 


Operands: dst, src0, src1

Description:
If (src0 <= src1), dst = non-zero value, otherwise dst = 0.

Constraints:
• dst has type unsigned.
• src0 has type float.
• src0 has width of 4.
• src1 has type float.
• src1 has width of 4.

Side Effects:
• Leaves the status register in an undefined state.
• Requires up to 1 level of enable stack.

Details:
dst   src0  src1  cycles  latency  temporaries | comments
m2u   m4f   m4f   7       7                    | macro
m2u   m4f   i4f   12      12       4 mono      | macro
p1u   p4f   p4f   10      11       4 poly      | macro
p1u   p4f   m4f   12      13       4 poly      | macro
p1u   p4f   i4f   12      13       4 poly      | macro



Document No. 06-RM-1137 Revision: 3.A 


ClearSpeed Technology plc 



cmp.le (floating point 64 bit) 


Operands: dst, src0, src1

Description:
If (src0 <= src1), dst = non-zero value, otherwise dst = 0.

Constraints:
• dst has type unsigned.
• src0 has type float.
• src0 has width of 8.
• src1 has type float.
• src1 has width of 8.

Side Effects:
• Leaves the status register in an undefined state.
• Requires up to 1 level of enable stack.

Details:
dst   src0  src1  cycles  latency  temporaries | comments
m2u   m8f   m8f   7       7                    | macro
m2u   m8f   i8f   16      16       8 mono      | macro
p1u   p8f   p8f   10      11       8 poly      | macro
p1u   p8f   m8f   16      17       8 poly      | macro
p1u   p8f   i8f   16      17       8 poly      | macro
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cmp.le (unsigned) 


Operands: dst, src0, src1

Description:
If (src0 <= src1), dst = non-zero value, otherwise dst = 0.

Constraints:
• dst has type unsigned.
• src0 has type unsigned.
• src1 has type unsigned.

Side Effects:
• Leaves the status register in an undefined state.
• Requires up to 1 level of enable stack.

Details:
dst   src0  src1  cycles  latency  temporaries | comments
m2u   m2u   m2u   3       3                    | macro
m2u   m2u   i2u   5       5        2 mono      | macro
p1u   p1u   p1u   6       7        1 poly      | macro
p1u   p1u   m2u   6       7        1 poly      | macro
p1u   p1u   i1u   6       7        1 poly      | macro
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cmp.le (signed) 


Operands: dst, src0, src1

Description:
If (src0 <= src1), dst = non-zero value, otherwise dst = 0.

Constraints:
• dst has type unsigned.
• src0 has type signed.
• src1 has type signed.

Side Effects:
• Leaves the status register in an undefined state.
• Requires up to 1 level of enable stack.

Details:
dst   src0  src1  cycles  latency  temporaries | comments
m2u   m2s   m2s   4       4                    | macro
m2u   m2s   i2s   6       6        2 mono      | macro
p1u   p1s   p1s   6       7        1 poly      | macro
p1u   p1s   m2s   6       7        1 poly      | macro
p1u   p1s   i1s   6       7        1 poly      | macro
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cmp.eq (floating point 32 bit) 


Operands: dst, src0, src1

Description:
If (src0 == src1), dst = non-zero value, otherwise dst = 0.

Constraints:
• dst has type unsigned.
• src0 has type float.
• src0 has width of 4.
• src1 has type float.
• src1 has width of 4.

Side Effects:
• Leaves the status register in an undefined state.
• Requires up to 1 level of enable stack.

Details:
dst   src0  src1  cycles  latency  temporaries | comments
m2u   m4f   m4f   6       6                    | macro
m2u   m4f   i4f   11      11       4 mono      | macro
p1u   p4f   p4f   8       9                    | macro
p1u   p4f   m4f   12      13       4 poly      | macro
p1u   p4f   i4f   12      13       4 poly      | macro
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cmp.eq (floating point 64 bit) 


Operands: dst, src0, src1

Description:
If (src0 == src1), dst = non-zero value, otherwise dst = 0.

Constraints:
• dst has type unsigned.
• src0 has type float.
• src0 has width of 8.
• src1 has type float.
• src1 has width of 8.

Side Effects:
• Leaves the status register in an undefined state.
• Requires up to 1 level of enable stack.

Details:
dst   src0  src1  cycles  latency  temporaries | comments
m2u   m8f   m8f   6       6                    | macro
m2u   m8f   i8f   15      15       8 mono      | macro
p1u   p8f   p8f   8       9                    | macro
p1u   p8f   m8f   16      17       8 poly      | macro
p1u   p8f   i8f   16      17       8 poly      | macro
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cmp.eq (integer) 


Operands: dst, src0, src1

Description:
If (src0 == src1), dst = non-zero value, otherwise dst = 0.

Constraints:
• dst has type unsigned.
• src0 has type integer.
• src1 has type integer.

Side Effects:
• Leaves the status register in an undefined state.
• Requires up to 1 level of enable stack.

Details:
dst   src0  src1  cycles  latency  temporaries | comments
m2u   m2us  m2us  3       3                    | macro
m2u   m2us  i2us  4       4                    | macro
p1u   p1us  p1us  5       6                    | macro
p1u   p1us  m2us  5       6                    | macro
p1u   p1us  i1us  5       6                    | macro
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cmp.ne (floating point 32 bit) 


Operands: dst, src0, src1

Description:
If (src0 != src1), dst = non-zero value, otherwise dst = 0.

Constraints:
• dst has type integer.
• src0 has type float.
• src0 has width of 4.
• src1 has type float.
• src1 has width of 4.

Side Effects:
• Leaves the status register in an undefined state.
• Requires up to 1 level of enable stack.

Details:
dst   src0  src1  cycles  latency  temporaries | comments
m2us  m4f   m4f   7       7                    | macro
m2us  m4f   i4f   12      12       4 mono      | macro
p1us  p4f   p4f   8       9                    | macro
p1us  p4f   m4f   12      13       4 poly      | macro
p1us  p4f   i4f   12      13       4 poly      | macro
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cmp.ne (floating point 64 bit) 


Operands: dst, src0, src1

Description:
If (src0 != src1), dst = non-zero value, otherwise dst = 0.

Constraints:
• dst has type integer.
• src0 has type float.
• src0 has width of 8.
• src1 has type float.
• src1 has width of 8.

Side Effects:
• Leaves the status register in an undefined state.
• Requires up to 1 level of enable stack.

Details:
dst   src0  src1  cycles  latency  temporaries | comments
m2us  m8f   m8f   7       7                    | macro
m2us  m8f   i8f   16      16       8 mono      | macro
p1us  p8f   p8f   8       9                    | macro
p1us  p8f   m8f   16      17       8 poly      | macro
p1us  p8f   i8f   16      17       8 poly      | macro
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cmp.ne (integer) 


Operands: dst, src0, src1

Description:
If (src0 != src1), dst = non-zero value, otherwise dst = 0.

Constraints:
• dst has type integer.
• src0 has type integer.
• src1 has type integer.

Side Effects:
• Leaves the status register in an undefined state.
• Requires up to 1 level of enable stack.

Details:
dst   src0  src1  cycles  latency  temporaries | comments
m2us  m2us  m2us  4       4                    | macro
m2us  m2us  i2us  5       5                    | macro
p1us  p1us  p1us  5       6                    | macro
p1us  p1us  m2us  5       6                    | macro
p1us  p1us  i1us  5       6                    | macro
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cmp.gt (floating point 32 bit) 


Operands: dst, src0, src1

Description:
If (src0 > src1), dst = non-zero value, otherwise dst = 0.

Constraints:
• dst has type unsigned.
• src0 has type float.
• src0 has width of 4.
• src1 has type float.
• src1 has width of 4.

Side Effects:
• Leaves the status register in an undefined state.
• Requires up to 1 level of enable stack.

Details:
dst   src0  src1  cycles  latency  temporaries | comments
m2u   m4f   m4f   6       6                    | macro
m2u   m4f   i4f   11      11       4 mono      | macro
p1u   p4f   p4f   8       9                    | macro
p1u   p4f   m4f   12      13       4 poly      | macro
p1u   p4f   i4f   12      13       4 poly      | macro
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cmp.gt (floating point 64 bit) 


Operands: dst, src0, src1

Description:
If (src0 > src1), dst = non-zero value, otherwise dst = 0.

Constraints:
• dst has type unsigned.
• src0 has type float.
• src0 has width of 8.
• src1 has type float.
• src1 has width of 8.

Side Effects:
• Leaves the status register in an undefined state.
• Requires up to 1 level of enable stack.

Details:
dst   src0  src1  cycles  latency  temporaries | comments
m2u   m8f   m8f   6       6                    | macro
m2u   m8f   i8f   15      15       8 mono      | macro
p1u   p8f   p8f   8       9                    | macro
p1u   p8f   m8f   16      17       8 poly      | macro
p1u   p8f   i8f   16      17       8 poly      | macro

cmp.gt (unsigned) 


Operands: dst, src0, src1

Description:
If (src0 > src1), dst = non-zero value, otherwise dst = 0.

Constraints:
• dst has type unsigned.
• src0 has type unsigned.
• src1 has type unsigned.

Side Effects:
• Leaves the status register in an undefined state.
• Requires up to 1 level of enable stack.

Details:
dst   src0  src1  cycles  latency  temporaries | comments
m2u   m2u   m2u   4       4                    | macro
m2u   m2u   i2u   6       6        2 mono      | macro
p1u   p1u   p1u   5       6                    | macro
p1u   p1u   m2u   6       7        1 poly      | macro
p1u   p1u   i1u   6       7        1 poly      | macro
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cmp.gt (signed) 


Operands: dst, src0, src1

Description:
If (src0 > src1), dst = non-zero value, otherwise dst = 0.

Constraints:
• dst has type unsigned.
• src0 has type signed.
• src1 has type signed.

Side Effects:
• Leaves the status register in an undefined state.
• Requires up to 1 level of enable stack.

Details:
dst   src0  src1  cycles  latency  temporaries | comments
m2u   m2s   m2s   3       3                    | macro
m2u   m2s   i2s   5       5        2 mono      | macro
p1u   p1s   p1s   6       7                    | macro
p1u   p1s   m2s   6       7                    | macro
p1u   p1s   i1s   6       7                    | macro
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cmp.ge (floating point 32 bit) 


Operands: dst, src0, src1

Description:
If (src0 >= src1), dst = non-zero value, otherwise dst = 0.

Constraints:
• dst has type unsigned.
• src0 has type float.
• src0 has width of 4.
• src1 has type float.
• src1 has width of 4.

Side Effects:
• Leaves the status register in an undefined state.
• Requires up to 1 level of enable stack.

Details:
dst   src0  src1  cycles  latency  temporaries | comments
m2u   m4f   m4f   7       7                    | macro
p1u   p4f   p4f   8       9                    | macro
p1u   p4f   m4f   12      13       4 poly      | macro
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cmp.ge (floating point 64 bit) 


Operands: dst, src0, src1

Description:
If (src0 >= src1), dst = non-zero value, otherwise dst = 0.

Constraints:
• dst has type unsigned.
• src0 has type float.
• src0 has width of 8.
• src1 has type float.
• src1 has width of 8.

Side Effects:
• Leaves the status register in an undefined state.
• Requires up to 1 level of enable stack.

Details:
dst   src0  src1  cycles  latency  temporaries | comments
m2u   m8f   m8f   7       7                    | macro
p1u   p8f   p8f   8       9                    | macro
p1u   p8f   m8f   16      17       8 poly      | macro
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cmp.ge (unsigned) 


Operands: dst, src0, src1

Description:
If (src0 >= src1), dst = non-zero value, otherwise dst = 0.

Constraints:
• dst has type unsigned.
• src0 has type unsigned.
• src1 has type unsigned.

Side Effects:
• Leaves the status register in an undefined state.
• Requires up to 1 level of enable stack.

Details:
dst   src0  src1  cycles  latency  temporaries | comments
m2u   m2u   m2u   3       3                    | macro
m2u   m2u   i2u   4       4                    | macro
p1u   p1u   p1u   5       6                    | macro
p1u   p1u   m2u   5       6                    | macro
p1u   p1u   i1u   5       6                    | macro
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cmp.ge (signed) 


Operands: dst, src0, src1

Description:
If (src0 >= src1), dst = non-zero value, otherwise dst = 0.

Constraints:
• dst has type unsigned.
• src0 has type signed.
• src1 has type signed.

Side Effects:
• Leaves the status register in an undefined state.
• Requires up to 1 level of enable stack.

Details:
dst   src0  src1  cycles  latency  temporaries | comments
m2u   m2s   m2s   4       4                    | macro
m2u   m2s   i2s   5       5                    | macro
p1u   p1s   p1s   5       6                    | macro
p1u   p1s   m2s   5       6                    | macro
p1u   p1s   i1s   5       6                    | macro
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3.1.6 Poly if instructions 


All of these operations act on the poly processing elements (PEs). All of them except
any.enable affect the enable stack, which allows conditional code to be run on the poly PEs.
The macro if instructions are made available by including if.inc.
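
The effect of an if instruction on the enable stack can be modelled in a few lines of Python. This is a sketch under stated assumptions rather than a description of the hardware: the class `PolyPE`, the function `if_lt`, and the rule that a PE runs only while every bit on its stack is 1 are all assumptions made for illustration.

```python
class PolyPE:
    """Toy model of one poly processing element (names are hypothetical)."""

    def __init__(self, value):
        self.value = value
        self.enable_stack = []  # last element is the most recently pushed bit

    def enabled(self):
        # Assumption: a PE executes only while all pushed bits are 1.
        return all(self.enable_stack)


def if_lt(pes, src1):
    """Model of if.lt with a mono src1: push 1 where value < src1, else 0."""
    for pe in pes:
        pe.enable_stack.append(1 if pe.value < src1 else 0)


pes = [PolyPE(v) for v in (1, 5, 3)]
if_lt(pes, 4)
print([pe.enabled() for pe in pes])
```

Each PE evaluates the condition on its own data, so after the push only the PEs for which the condition held remain enabled for the conditional code.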
if.lt (floating point 32 bit)

Operands: src0, src1

Description:
If (src0 < src1), push 1 (enable) onto the enable stack, otherwise
push 0 (disable).

Constraints:
• src0 has domain poly.
• src0 has type float.
• src0 has width of 4.
• src1 has type float.
• src1 has width of 4.

Side Effects:
• Leaves the status register in an undefined state.

Details:
src0  src1  cycles  latency  temporaries | comments
p4f   m4f   9       10       4 poly      | macro
p4f   p4f   5       5                    | macro
if.lt (floating point 64 bit)

Operands: src0, src1

Description:
If (src0 < src1), push 1 (enable) onto the enable stack, otherwise
push 0 (disable).

Constraints:
• src0 has domain poly.
• src0 has type float.
• src0 has width of 8.
• src1 has type float.
• src1 has width of 8.

Side Effects:
• Leaves the status register in an undefined state.

Details:
src0  src1  cycles  latency  temporaries | comments
p8f   m8f   13      14       8 poly      | macro
p8f   p8f   5       5                    | macro

if.lt (unsigned)

Operands: src0, src1

Description:
If (src0 < src1), push 1 (enable) onto the enable stack, otherwise
push 0 (disable).

Constraints:
• src0 has domain poly.
• src0 has type unsigned.
• src1 has type unsigned.

Side Effects:
• Leaves the status register in an undefined state.

Details:
src0  src1  cycles  latency  temporaries | comments
p1u   m2u   2       2                    | macro
p1u   i1u   2       2                    | macro
p2u   m2u   3       3                    | macro
p2u   i2u   3       3                    | macro
p4u   m4u   5       5                    | macro
p4u   i4u   5       5                    | macro
p1u   p1u   2       2                    | macro
p2u   p2u   3       3                    | macro
p4u   p4u   5       5                    | macro


if.lt (signed)

Operands: src0, src1

Description:
If (src0 < src1), push 1 (enable) onto the enable stack, otherwise
push 0 (disable).

Constraints:
• src0 has domain poly.
• src0 has type signed.
• src1 has type signed.

Side Effects:
• Leaves the status register in an undefined state.

Details:
src0  src1  cycles  latency  temporaries | comments
p1s   m2s   2       2                    | macro
p1s   i1s   2       2                    | macro
p2s   m2s   3       3                    | macro
p2s   i2s   3       3                    | macro
p4s   m4s   5       5                    | macro
p4s   i4s   5       5                    | macro
p1s   p1s   2       2                    | macro
p2s   p2s   3       3                    | macro
p4s   p4s   5       5                    | macro


if.le (floating point 32 bit) 


Operands: src0, src1

Description:
If (src0 <= src1), push 1 (enable) onto the enable stack, otherwise
push 0 (disable).

Constraints:
• src0 has domain poly.
• src0 has type float.
• src0 has width of 4.
• src1 has type float.
• src1 has width of 4.

Side Effects:
• Leaves the status register in an undefined state.

Details:
src0  src1  cycles  latency  temporaries | comments
p4f   m4f   9       10       4 poly      | macro
p4f   p4f   5       5                    | macro


if.le (floating point 64 bit) 


Operands: src0, src1

Description:
If (src0 <= src1), push 1 (enable) onto the enable stack, otherwise
push 0 (disable).

Constraints:
• src0 has domain poly.
• src0 has type float.
• src0 has width of 8.
• src1 has type float.
• src1 has width of 8.

Side Effects:
• Leaves the status register in an undefined state.

Details:
src0  src1  cycles  latency  temporaries | comments
p8f   m8f   13      14       8 poly      | macro
p8f   p8f   5       5                    | macro


if.le (unsigned) 


Operands: src0, src1

Description:
If (src0 <= src1), push 1 (enable) onto the enable stack, otherwise
push 0 (disable).

Constraints:
• src0 has domain poly.
• src0 has type unsigned.
• src1 has type unsigned.

Side Effects:
• Leaves the status register in an undefined state.

Details:
src0  src1  cycles  latency  temporaries | comments
p1u   m2u   3       4        1 poly      | macro
p2u   m2u   5       6        2 poly      | macro
p4u   m4u   9       10       4 poly      | macro
p1u   p1u   2       2                    | macro
p2u   p2u   3       3                    | macro
p4u   p4u   5       5                    | macro


if.le (signed) 


Operands: src0, src1

Description:
If (src0 <= src1), push 1 (enable) onto the enable stack, otherwise
push 0 (disable).

Constraints:
• src0 has domain poly.
• src0 has type signed.
• src1 has type signed.

Side Effects:
• Leaves the status register in an undefined state.

Details:
src0  src1  cycles  latency  temporaries | comments
p1s   m2s   3       4        1 poly      | macro
p2s   m2s   5       6        2 poly      | macro
p4s   m4s   9       10       4 poly      | macro
p1s   p1s   2       2                    | macro
p2s   p2s   3       3                    | macro
p4s   p4s   5       5                    | macro


if.eq (floating point 32 bit) 


Operands: src0, src1

Description:
If (src0 == src1), push 1 (enable) onto the enable stack, otherwise
push 0 (disable).

Constraints:
• src0 has domain poly.
• src0 has type float.
• src0 has width of 4.
• src1 has type float.
• src1 has width of 4.

Side Effects:
• Leaves the status register in an undefined state.

Details:
src0  src1  cycles  latency  temporaries | comments
p4f   m4f   9       10       4 poly      | macro
p4f   i4f   9       10       4 poly      | macro
p4f   p4f   5       5                    | macro

if.eq (floating point 64 bit)

Operands: src0, src1

Description:
If (src0 == src1), push 1 (enable) onto the enable stack, otherwise
push 0 (disable).

Constraints:
• src0 has domain poly.
• src0 has type float.
• src0 has width of 8.
• src1 has type float.
• src1 has width of 8.

Side Effects:
• Leaves the status register in an undefined state.

Details:
src0  src1  cycles  latency  temporaries | comments
p8f   m8f   13      14       8 poly      | macro
p8f   i8f   13      14       8 poly      | macro
p8f   p8f   5       5                    | macro


if.eq (integer) 


Operands: src0, src1

Description:
If (src0 == src1), push 1 (enable) onto the enable stack, otherwise
push 0 (disable).

Constraints:
• src0 has domain poly.
• src0 has type integer.
• src1 has type integer.

Side Effects:
• Leaves the status register in an undefined state.

Details:
src0  src1  cycles  latency  temporaries | comments
p1us  m2us  2       2                    | macro
p1us  i1us  2       2                    | macro
p2us  m2us  3       3                    | macro
p2us  i2us  3       3                    | macro
p4us  m4us  5       5                    | macro
p4us  i4us  5       5                    | macro
p1us  p1us  2       2                    | macro
p2us  p2us  3       3                    | macro
p4us  p4us  5       5                    | macro


if.ne (floating point 32 bit) 


Operands: src0, src1

Description:
If (src0 != src1), push 1 (enable) onto the enable stack, otherwise
push 0 (disable).

Constraints:
• src0 has domain poly.
• src0 has type float.
• src0 has width of 4.
• src1 has type float.
• src1 has width of 4.

Side Effects:
• Leaves the status register in an undefined state.

Details:
src0  src1  cycles  latency  temporaries | comments
p4f   m4f   9       10       4 poly      | macro
p4f   i4f   9       10       4 poly      | macro
p4f   p4f   5       5                    | macro


if.ne (floating point 64 bit) 


Operands: src0, src1

Description:
If (src0 != src1), push 1 (enable) onto the enable stack, otherwise
push 0 (disable).

Constraints:
• src0 has domain poly.
• src0 has type float.
• src0 has width of 8.
• src1 has type float.
• src1 has width of 8.

Side Effects:
• Leaves the status register in an undefined state.

Details:
src0  src1  cycles  latency  temporaries | comments
p8f   m8f   13      14       8 poly      | macro
p8f   i8f   13      14       8 poly      | macro
p8f   p8f   5       5                    | macro


if.ne (integer) 


Operands: src0, src1

Description:
If (src0 != src1), push 1 (enable) onto the enable stack, otherwise
push 0 (disable).

Constraints:
• src0 has domain poly.
• src0 has type integer.
• src1 has type integer.

Side Effects:
• Leaves the status register in an undefined state.

Details:
src0  src1  cycles  latency  temporaries | comments
p1us  m2us  2       2                    | macro
p1us  i1us  2       2                    | macro
p2us  m2us  3       3                    | macro
p2us  i2us  3       3                    | macro
p4us  m4us  5       5                    | macro
p4us  i4us  5       5                    | macro
p1us  p1us  2       2                    | macro
p2us  p2us  3       3                    | macro
p4us  p4us  5       5                    | macro


if.gt (floating point 32 bit) 


Operands: src0, src1

Description:
If (src0 > src1), push 1 (enable) onto the enable stack, otherwise
push 0 (disable).

Constraints:
• src0 has domain poly.
• src0 has type float.
• src0 has width of 4.
• src1 has type float.
• src1 has width of 4.

Side Effects:
• Leaves the status register in an undefined state.

Details:
src0  src1  cycles  latency  temporaries | comments
p4f   m4f   10      11       4 poly      | macro
p4f   p4f   6       6                    | macro


if.gt (floating point 64 bit) 


Operands: src0, src1

Description:
If (src0 > src1), push 1 (enable) onto the enable stack, otherwise
push 0 (disable).

Constraints:
• src0 has domain poly.
• src0 has type float.
• src0 has width of 8.
• src1 has type float.
• src1 has width of 8.

Side Effects:
• Leaves the status register in an undefined state.

Details:
src0  src1  cycles  latency  temporaries | comments
p8f   m8f   14      15       8 poly      | macro
p8f   p8f   6       6                    | macro


if.gt (unsigned) 


Operands: src0, src1

Description:
If (src0 > src1), push 1 (enable) onto the enable stack, otherwise
push 0 (disable).

Constraints:
• src0 has domain poly.
• src0 has type unsigned.
• src1 has type unsigned.

Side Effects:
• Leaves the status register in an undefined state.

Details:
src0  src1  cycles  latency  temporaries | comments
p1u   m2u   3       4        1 poly      | macro
p1u   i1u   3       4        1 poly      | macro
p2u   m2u   5       6        2 poly      | macro
p2u   i2u   5       6        2 poly      | macro
p4u   m4u   9       10       4 poly      | macro
p4u   i4u   9       10       4 poly      | macro
p1u   p1u   2       2                    | macro
p2u   p2u   3       3                    | macro
p4u   p4u   5       5                    | macro


if.gt (signed) 


Operands: src0, src1

Description:
If (src0 > src1), push 1 (enable) onto the enable stack, otherwise
push 0 (disable).

Constraints:
• src0 has domain poly.
• src0 has type signed.
• src1 has type signed.

Side Effects:
• Leaves the status register in an undefined state.

Details:
src0  src1  cycles  latency  temporaries | comments
p1s   m2s   3       3                    | macro
p1s   i1s   3       3                    | macro
p2s   m2s   4       4                    | macro
p2s   i2s   4       4                    | macro
p4s   m4s   6       6                    | macro
p4s   i4s   6       6                    | macro
p1s   p1s   3       3                    | macro
p2s   p2s   4       4                    | macro
p4s   p4s   6       6                    | macro


if.ge (floating point 32 bit) 


Operands: src0, src1

Description:
If (src0 >= src1), push 1 (enable) onto the enable stack, otherwise
push 0 (disable).

Constraints:
• src0 has domain poly.
• src0 has type float.
• src0 has width of 4.
• src1 has type float.
• src1 has width of 4.

Side Effects:
• Leaves the status register in an undefined state.

Details:
src0  src1  cycles  latency  temporaries | comments
p4f   m4f   9       10       4 poly      | macro
p4f   p4f   5       5                    | macro


if.ge (floating point 64 bit) 


Operands: src0, src1

Description:
If (src0 >= src1), push 1 (enable) onto the enable stack, otherwise
push 0 (disable).

Constraints:
• src0 has domain poly.
• src0 has type float.
• src0 has width of 8.
• src1 has type float.
• src1 has width of 8.

Side Effects:
• Leaves the status register in an undefined state.

Details:
src0  src1  cycles  latency  temporaries | comments
p8f   m8f   13      14       8 poly      | macro
p8f   p8f   5       5                    | macro


if.ge (unsigned) 


Operands: src0, src1

Description:
If (src0 >= src1), push 1 (enable) onto the enable stack, otherwise
push 0 (disable).

Constraints:
• src0 has domain poly.
• src0 has type unsigned.
• src1 has type unsigned.

Side Effects:
• Leaves the status register in an undefined state.

Details:
src0  src1  cycles  latency  temporaries | comments
p1u   m2u   2       2                    | macro
p1u   i1u   2       2                    | macro
p2u   m2u   3       3                    | macro
p2u   i2u   3       3                    | macro
p4u   m4u   5       5                    | macro
p4u   i4u   5       5                    | macro
p1u   p1u   2       2                    | macro
p2u   p2u   3       3                    | macro
p4u   p4u   5       5                    | macro


if.ge (signed) 


Operands: src0, src1

Description:
If (src0 >= src1), push 1 (enable) onto the enable stack, otherwise
push 0 (disable).

Constraints:
• src0 has domain poly.
• src0 has type signed.
• src1 has type signed.

Side Effects:
• Leaves the status register in an undefined state.

Details:
src0  src1  cycles  latency  temporaries | comments
p1s   m2s   2       2                    | macro
p1s   i1s   2       2                    | macro
p2s   m2s   3       3                    | macro
p2s   i2s   3       3                    | macro
p4s   m4s   5       5                    | macro
p4s   i4s   5       5                    | macro
p1s   p1s   2       2                    | macro
p2s   p2s   3       3                    | macro
p4s   p4s   5       5                    | macro

andif.lt (floating point 32 bit)

Operands: src0, src1

Description:
If (src0 < src1), set the lsb of the enable stack to 1 (enable),
otherwise to 0 (disable).

Constraints:
• src0 has domain poly.
• src0 has type float.
• src0 has width of 4.
• src1 has type float.
• src1 has width of 4.

Side Effects:
• Leaves the status register in an undefined state.

Details:
src0  src1  cycles  latency  temporaries | comments
p4f   m4f   9       10       4 poly      | macro
p4f   p4f   5       5                    | macro
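
Unlike the if instructions, which push a new bit, the andif instructions overwrite the lsb of the enable stack. The Python sketch below models that difference for a single PE. The function name `andif_lt` and the list representation of the stack (last element as the lsb) are assumptions. The sketch also applies the update unconditionally, because this section does not state whether a PE that is currently disabled performs it.

```python
def andif_lt(enable_stack, src0, src1):
    """Return a copy of one PE's enable stack with its lsb replaced.

    The lsb is modelled as the last list element (an assumption).
    """
    if not enable_stack:
        raise ValueError("andif needs an existing enable stack entry")
    updated = list(enable_stack)
    updated[-1] = 1 if src0 < src1 else 0  # overwrite, do not push
    return updated


print(andif_lt([1, 1], 2, 3))
print(andif_lt([1, 1], 5, 3))
```

The stack depth is unchanged by the call, which is consistent with the andif entries listing no enable stack requirement under Side Effects.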


andif.lt (floating point 64 bit)

Operands: src0, src1

Description:
If (src0 < src1), set the lsb of the enable stack to 1 (enable),
otherwise to 0 (disable).

Constraints:
• src0 has domain poly.
• src0 has type float.
• src0 has width of 8.
• src1 has type float.
• src1 has width of 8.

Side Effects:
• Leaves the status register in an undefined state.

Details:
src0  src1  cycles  latency  temporaries | comments
p8f   m8f   13      14       8 poly      | macro
p8f   p8f   5       5                    | macro


andif.lt (unsigned)

Operands: src0, src1

Description:
If (src0 < src1), set the lsb of the enable stack to 1 (enable),
otherwise to 0 (disable).

Constraints:
• src0 has domain poly.
• src0 has type unsigned.
• src1 has type unsigned.

Side Effects:
• Leaves the status register in an undefined state.

Details:
src0  src1  cycles  latency  temporaries | comments
p1u   m2u   2       2                    | macro
p1u   i1u   2       2                    | macro
p2u   m2u   3       3                    | macro
p2u   i2u   3       3                    | macro
p4u   m4u   5       5                    | macro
p4u   i4u   5       5                    | macro
p1u   p1u   2       2                    | macro
p2u   p2u   3       3                    | macro
p4u   p4u   5       5                    | macro


andif.lt (signed)

Operands: src0, src1

Description:
If (src0 < src1), set the lsb of the enable stack to 1 (enable),
otherwise to 0 (disable).

Constraints:
• src0 has domain poly.
• src0 has type signed.
• src1 has type signed.

Side Effects:
• Leaves the status register in an undefined state.

Details:
src0  src1  cycles  latency  temporaries | comments
p1s   m2s   2       2                    | macro
p1s   i1s   2       2                    | macro
p2s   m2s   3       3                    | macro
p2s   i2s   3       3                    | macro
p4s   m4s   5       5                    | macro
p4s   i4s   5       5                    | macro
p1s   p1s   2       2                    | macro
p2s   p2s   3       3                    | macro
p4s   p4s   5       5                    | macro


andif.le (floating point 32 bit) 


Operands: src0, src1

Description:
If (src0 <= src1), set the lsb of the enable stack to 1 (enable),
otherwise to 0 (disable).

Constraints:
• src0 has domain poly.
• src0 has type float.
• src0 has width of 4.
• src1 has type float.
• src1 has width of 4.

Side Effects:
• Leaves the status register in an undefined state.

Details:
src0  src1  cycles  latency  temporaries | comments
p4f   m4f   9       10       4 poly      | macro
p4f   p4f   5       5                    | macro


andif.le (floating point 64 bit)

Operands: src0, src1

Description:
If (src0 <= src1), set the lsb of the enable stack to 1 (enable),
otherwise to 0 (disable).

Constraints:
• src0 has domain poly.
• src0 has type float.
• src0 has width of 8.
• src1 has type float.
• src1 has width of 8.

Side Effects:
• Leaves the status register in an undefined state.

Details:
src0  src1  cycles  latency  temporaries | comments
p8f   m8f   13      14       8 poly      | macro
p8f   p8f   5       5                    | macro


andif.le (unsigned) 


Operands: src0, src1

Description:
If (src0 <= src1), set the lsb of the enable stack to 1 (enable),
otherwise to 0 (disable).

Constraints:
• src0 has domain poly.
• src0 has type unsigned.
• src1 has type unsigned.

Side Effects:
• Leaves the status register in an undefined state.

Details:
src0  src1  cycles  latency  temporaries | comments
p1u   m2u   3       4        1 poly      | macro
p2u   m2u   5       6        2 poly      | macro
p4u   m4u   9       10       4 poly      | macro
p1u   p1u   2       2                    | macro
p2u   p2u   3       3                    | macro
p4u   p4u   5       5                    | macro


andif.le (signed) 


Operands: src0, src1

Description:
If (src0 <= src1), set lsb of enable stack with 1 (enable),
otherwise 0 (disable).

Constraints:
• src0 has domain poly.
• src0 has type signed.
• src1 has type signed.

Side Effects:
• Leaves the status register in an undefined state.

Details:
src0  src1  cycles  latency  temporaries  comments
p1s   m2s   3       4        1 poly       macro
p2s   m2s   5       6        2 poly       macro
p4s   m4s   9       10       4 poly       macro
p1s   p1s   2       2                     macro
p2s   p2s   3       3                     macro
p4s   p4s   5       5                     macro
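
The unsigned and signed variants of andif.le can give different answers for the same bit pattern. The following Python sketch illustrates this for a single pair of operands. It is a model only: the function names are hypothetical, and it assumes that signed operands use two's complement encoding, which this manual does not state in this section.

```python
def as_unsigned(raw: int, width: int) -> int:
    """Interpret the low `width` bytes of raw as an unsigned integer."""
    return raw & ((1 << (8 * width)) - 1)


def as_signed(raw: int, width: int) -> int:
    """Interpret the low `width` bytes of raw as a signed integer.

    Assumption: two's complement encoding.
    """
    bits = 8 * width
    value = raw & ((1 << bits) - 1)
    if value & (1 << (bits - 1)):
        return value - (1 << bits)
    return value


def le_unsigned(src0: int, src1: int, width: int) -> bool:
    """Condition tested by the unsigned variant: src0 <= src1."""
    return as_unsigned(src0, width) <= as_unsigned(src1, width)


def le_signed(src0: int, src1: int, width: int) -> bool:
    """Condition tested by the signed variant: src0 <= src1."""
    return as_signed(src0, width) <= as_signed(src1, width)
```

For example, with 1-byte operands 0xFF and 0x01, le_unsigned compares 255 with 1 and le_signed compares -1 with 1, so only the signed comparison holds.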


andif.eq (floating point 32 bit) 


Operands: src0, src1

Description:
If (src0 == src1), set lsb of enable stack with 1 (enable),
otherwise 0 (disable).

Constraints:
• src0 has domain poly.
• src0 has type float.
• src0 has width of 4.
• src1 has type float.
• src1 has width of 4.

Side Effects:
• Leaves the status register in an undefined state.

Details:
src0  src1  cycles  latency  temporaries  comments
p4f   m4f   9       10       4 poly       macro
p4f   il4f  9       10       4 poly       macro
p4f   p4f   5       5                     macro


andif.eq (floating point 64 bit) 


Operands: src0, src1

Description:
If (src0 == src1), set lsb of enable stack with 1 (enable),
otherwise 0 (disable).

Constraints:
• src0 has domain poly.
• src0 has type float.
• src0 has width of 8.
• src1 has type float.
• src1 has width of 8.

Side Effects:
• Leaves the status register in an undefined state.

Details:
src0  src1  cycles  latency  temporaries  comments
p8f   m8f   13      14       8 poly       macro
p8f   il8f  13      14       8 poly       macro
p8f   p8f   5       5                     macro


andif.eq (integer) 


Operands: src0, src1

Description:
If (src0 == src1), set lsb of enable stack with 1 (enable),
otherwise 0 (disable).

Constraints:
• src0 has domain poly.
• src0 has type integer.
• src1 has type integer.

Side Effects:
• Leaves the status register in an undefined state.

Details:
src0  src1   cycles  latency  temporaries  comments
p1us  m2us   2       2                     macro
p1us  il1us  2       2                     macro
p2us  m2us   3       3                     macro
p2us  il2us  3       3                     macro
p4us  m4us   5       5                     macro
p4us  il4us  5       5                     macro
p1us  p1us   2       2                     macro
p2us  p2us   3       3                     macro
p4us  p4us   5       5                     macro
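
The operand entries in the Details tables (for example p4f or m2us) pack a domain letter, a width in bytes and a type into one short code. The following Python sketch decodes such codes. It is illustrative only: the function name is hypothetical, and the readings of "il" as "immediate or label" and of "us" as "unsigned or signed" are assumptions inferred from the tables and constraint lists, not definitions given in this section.

```python
import re

# Domain letters m, p and i follow the operand specifier description;
# the meaning of "il" is an assumption.
DOMAINS = {
    "m": "mono",
    "p": "poly",
    "i": "immediate",
    "il": "immediate or label",
}

# The meaning of "us" is an assumption.
TYPES = {
    "u": "unsigned",
    "s": "signed",
    "us": "unsigned or signed",
    "f": "float",
    "": "unspecified",
}

_SPEC = re.compile(r"^(il|[mpi])(\d)(us|u|s|f)?$")


def decode_spec(spec: str) -> tuple:
    """Split a table operand code into (domain, width in bytes, type)."""
    match = _SPEC.match(spec)
    if match is None:
        raise ValueError(f"unrecognised operand code: {spec!r}")
    domain = DOMAINS[match.group(1)]
    width = int(match.group(2))
    type_name = TYPES[match.group(3) or ""]
    return domain, width, type_name
```

Under these assumptions, decode_spec("p4f") describes a 4-byte poly float operand, which matches the constraints listed for the 32 bit floating point variants.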


andif.ne (floating point 32 bit) 
Operands: src0, src1

Description:
If (src0 != src1), set lsb of enable stack with 1 (enable),
otherwise 0 (disable).

Constraints:
• src0 has domain poly.
• src0 has type float.
• src0 has width of 4.
• src1 has type float.
• src1 has width of 4.

Side Effects:
• Leaves the status register in an undefined state.

Details:
src0  src1  cycles  latency  temporaries  comments
p4f   m4f   9       10       4 poly       macro
p4f   il4f  9       10       4 poly       macro
p4f   p4f   5       5                     macro


andif.ne (floating point 64 bit) 


Operands: src0, src1

Description:
If (src0 != src1), set lsb of enable stack with 1 (enable),
otherwise 0 (disable).

Constraints:
• src0 has domain poly.
• src0 has type float.
• src0 has width of 8.
• src1 has type float.
• src1 has width of 8.

Side Effects:
• Leaves the status register in an undefined state.

Details:
src0  src1  cycles  latency  temporaries  comments
p8f   m8f   13      14       8 poly       macro
p8f   il8f  13      14       8 poly       macro
p8f   p8f   5       5                     macro


andif.ne (integer) 


Operands: src0, src1

Description:
If (src0 != src1), set lsb of enable stack with 1 (enable),
otherwise 0 (disable).

Constraints:
• src0 has domain poly.
• src0 has type integer.
• src1 has type integer.

Side Effects:
• Leaves the status register in an undefined state.

Details:
src0  src1   cycles  latency  temporaries  comments
p1us  m2us   2       2                     macro
p1us  il1us  2       2                     macro
p2us  m2us   3       3                     macro
p2us  il2us  3       3                     macro
p4us  m4us   5       5                     macro
p4us  il4us  5       5                     macro
p1us  p1us   2       2                     macro
p2us  p2us   3       3                     macro
p4us  p4us   5       5                     macro


andif.gt (floating point 32 bit) 


Operands: src0, src1

Description:
If (src0 > src1), set lsb of enable stack with 1 (enable), otherwise
0 (disable).

Constraints:
• src0 has domain poly.
• src0 has type float.
• src0 has width of 4.
• src1 has type float.
• src1 has width of 4.

Side Effects:
• Leaves the status register in an undefined state.

Details:
src0  src1  cycles  latency  temporaries  comments
p4f   m4f   10      11       4 poly       macro
p4f   p4f   6       6                     macro


andif.gt (floating point 64 bit) 


Operands: src0, src1

Description:
If (src0 > src1), set lsb of enable stack with 1 (enable), otherwise
0 (disable).

Constraints:
• src0 has domain poly.
• src0 has type float.
• src0 has width of 8.
• src1 has type float.
• src1 has width of 8.

Side Effects:
• Leaves the status register in an undefined state.

Details:
src0  src1  cycles  latency  temporaries  comments
p8f   m8f   14      15       8 poly       macro
p8f   p8f   6       6                     macro


andif.gt (unsigned) 


Operands: src0, src1

Description:
If (src0 > src1), set lsb of enable stack with 1 (enable), otherwise
0 (disable).

Constraints:
• src0 has domain poly.
• src0 has type unsigned.
• src1 has type unsigned.

Side Effects:
• Leaves the status register in an undefined state.

Details:
src0  src1  cycles  latency  temporaries  comments
p1u   m2u   3       4        1 poly       macro
p1u   il1u  3       4        1 poly       macro
p2u   m2u   5       6        2 poly       macro
p2u   il2u  5       6        2 poly       macro
p4u   m4u   9       10       4 poly       macro
p4u   il4u  9       10       4 poly       macro
p1u   p1u   3       4        1 poly       macro
p2u   p2u   5       6        2 poly       macro
p4u   p4u   7       7        4 poly       macro


andif.gt (signed) 


Operands: src0, src1

Description:
If (src0 > src1), set lsb of enable stack with 1 (enable), otherwise
0 (disable).

Constraints:
• src0 has domain poly.
• src0 has type signed.
• src1 has type signed.

Side Effects:
• Leaves the status register in an undefined state.

Details:
src0  src1  cycles  latency  temporaries  comments
p1s   m2s   3       3                     macro
p1s   il1s  3       3                     macro
p2s   m2s   4       4                     macro
p2s   il2s  4       4                     macro
p4s   m4s   6       6                     macro
p4s   il4s  6       6                     macro
p1s   p1s   3       3                     macro
p2s   p2s   4       4                     macro
p4s   p4s   6       6                     macro


andif.ge (floating point 32 bit) 


Operands: src0, src1

Description:
If (src0 >= src1), set lsb of enable stack with 1 (enable),
otherwise 0 (disable).

Constraints:
• src0 has domain poly.
• src0 has type float.
• src0 has width of 4.
• src1 has type float.
• src1 has width of 4.

Side Effects:
• Leaves the status register in an undefined state.

Details:
src0  src1  cycles  latency  temporaries  comments
p4f   m4f   9       10       4 poly       macro
p4f   p4f   5       5                     macro


andif.ge (floating point 64 bit) 


Operands: src0, src1

Description:
If (src0 >= src1), set lsb of enable stack with 1 (enable),
otherwise 0 (disable).

Constraints:
• src0 has domain poly.
• src0 has type float.
• src0 has width of 8.
• src1 has type float.
• src1 has width of 8.

Side Effects:
• Leaves the status register in an undefined state.

Details:
src0  src1  cycles  latency  temporaries  comments
p8f   m8f   13      14       8 poly       macro
p8f   p8f   5       5                     macro


andif.ge (unsigned) 


Operands: src0, src1

Description:
If (src0 >= src1), set lsb of enable stack with 1 (enable),
otherwise 0 (disable).

Constraints:
• src0 has domain poly.
• src0 has type unsigned.
• src1 has type unsigned.

Side Effects:
• Leaves the status register in an undefined state.

Details:
src0  src1  cycles  latency  temporaries  comments
p1u   m2u   2       2                     macro
p1u   il1u  2       2                     macro
p2u   m2u   3       3                     macro
p2u   il2u  3       3                     macro
p4u   m4u   5       5                     macro
p4u   il4u  5       5                     macro
p1u   p1u   2       2                     macro
p2u   p2u   3       3                     macro
p4u   p4u   5       5                     macro


andif.ge (signed) 


Operands: src0, src1

Description:
If (src0 >= src1), set lsb of enable stack with 1 (enable),
otherwise 0 (disable).

Constraints:
• src0 has domain poly.
• src0 has type signed.
• src1 has type signed.

Side Effects:
• Leaves the status register in an undefined state.

Details:
src0  src1  cycles  latency  temporaries  comments
p1s   m2s   2       2                     macro
p1s   il1s  2       2                     macro
p2s   m2s   3       3                     macro
p2s   il2s  3       3                     macro
p4s   m4s   5       5                     macro
p4s   il4s  5       5                     macro
p1s   p1s   2       2                     macro
p2s   p2s   3       3                     macro
p4s   p4s   5       5                     macro


if.cry 


Operands: 


Description: 


If carry flag is set, push enable stack with 1 (enable), otherwise 0 
(disable). 


Details:
cycles  latency  temporaries  comments
1       1                     microcode


if.ncry 


Operands: 


Description: 


If carry flag is not set, push enable stack with 1 (enable), 
otherwise 0 (disable). 


Details: 
cycles  latency  temporaries  comments
1       1                     microcode


if.zero 


Operands: 


Description: 


If zero flag is set, push enable stack with 1 (enable), otherwise 0 
(disable). 


Details:
cycles  latency  temporaries  comments
1       1                     microcode


if.nzero 


Operands: 


Description: 


If zero flag is not set, push enable stack with 1 (enable), 
otherwise 0 (disable). 


Details: 
cycles  latency  temporaries  comments
1       1                     microcode


if.neg 


Operands: 


Description: 


If negative flag is set, push enable stack with 1 (enable), 
otherwise 0 (disable). 


Details: 
cycles  latency  temporaries  comments
1       1                     microcode


if.nneg 


Operands: 


Description: 


If negative flag is not set, push enable stack with 1 (enable), 
otherwise 0 (disable). 


Details: 
cycles  latency  temporaries  comments
1       1                     microcode


if.msb 


Operands: src

Description:
If msb of src is set, push enable stack with 1 (enable),
otherwise 0 (disable).

Constraints:
• src has domain poly.
• src has type integer.

Details:
src   cycles  latency  temporaries  comments
p2us  1       1                     microcode
p1us  1       1                     microcode
p4us  1       1                     microcode


if.nmsb 


Operands: src

Description:
If msb of src is not set, push enable stack with 1 (enable),
otherwise 0 (disable).

Constraints:
• src has domain poly.
• src has type integer.

Details:
src   cycles  latency  temporaries  comments
p2us  1       1                     microcode
p1us  1       1                     microcode
p4us  1       1                     microcode


if.fpadd.zero 


Operands: 
Description: 


If zero floating point add flag is set, push enable stack with 1 
(enable), otherwise 0 (disable). 


Details: 


cycles  latency  temporaries  comments
1       1                     microcode




if.fpadd.nzero 


Operands: 
Description: 


If zero floating point add flag is not set, push enable stack with 1 
(enable), otherwise 0 (disable). 


Details: 


cycles  latency  temporaries  comments
1       1                     microcode


if.fpadd.neg 


Operands: 


Description: 


If negative floating point add flag is set, push enable stack with 1 
(enable), otherwise 0 (disable). 


Details: 


cycles  latency  temporaries  comments
1       1                     microcode


if.fpadd.nneg 


Operands: 


Description: 


If negative floating point add flag is not set, push enable stack 
with 1 (enable), otherwise 0 (disable). 


Details: 
cycles  latency  temporaries  comments
1       1                     microcode


andif.cry 


Operands: 
Description: 


If carry flag is set, set lsb of enable stack with 1 (enable), 
otherwise 0 (disable). 


Details: 


cycles  latency  temporaries  comments
1       1                     microcode


andif.ncry 


Operands: 
Description: 


If carry flag is not set, set lsb of enable stack with 1 (enable),
otherwise 0 (disable).


Details:
cycles  latency  temporaries  comments
1       1                     microcode


andif.zero 


Operands: 
Description: 


If zero flag is set, set lsb of enable stack with 1 (enable), 
otherwise 0 (disable). 


Details: 


cycles  latency  temporaries  comments
1       1                     microcode


andif.nzero 


Operands: 
Description: 


If zero flag is not set, set lsb of enable stack with 1 (enable), 
otherwise 0 (disable). 


Details: 


cycles  latency  temporaries  comments
1       1                     microcode


andif.neg 


Operands: 
Description: 


If negative flag is set, set lsb of enable stack with 1 (enable), 
otherwise 0 (disable). 


Details: 


cycles  latency  temporaries  comments
1       1                     microcode


andif.nneg 


Operands: 


Description: 


If negative flag is not set, set lsb of enable stack with 1
(enable), otherwise 0 (disable).


Details:
cycles  latency  temporaries  comments
1       1                     microcode


andif.fpadd.zero 


Operands: 


Description: 


If zero floating point add flag is set, set lsb of enable stack 
with 1 (enable), otherwise 0 (disable). 


Details: 
cycles  latency  temporaries  comments
1       1                     microcode


andif.fpadd.nzero 


Operands: 


Description: 


If zero floating point add flag is not set, set lsb of enable stack 
with 1 (enable), otherwise 0 (disable). 


Details: 
cycles  latency  temporaries  comments
1       1                     microcode


andif.fpadd.neg 


Operands: 


Description: 


If negative floating point add flag is set, set lsb of enable stack 
with 1 (enable), otherwise 0 (disable). 


Details: 
cycles  latency  temporaries  comments
1       1                     microcode


andif.fpadd.nneg 


Operands:

Description:
If negative floating point add flag is not set, set lsb of enable
stack with 1 (enable), otherwise 0 (disable).

Details:
cycles  latency  temporaries  comments
1       1                     microcode


else 

Operands: 

Description: 

Toggle the least significant bit of enable stack.

Details:
cycles  latency  temporaries  comments
1       1                     microcode



endif 


Operands:

Description:
Pop the enable stack: shifts the stack right, putting 1 in the msb.

Details:
cycles  latency  temporaries  comments
1       1                     microcode
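
The enable stack operations described in this section (the if.* instructions push a bit, the andif.* instructions set the lsb, else toggles the lsb, and endif pops) can be modelled for a single PE as a small shift register. The following Python sketch is illustrative only. The class and method names are hypothetical, and two points are assumptions not stated in this section: the stack is taken to be 8 bits deep, and a PE is taken to be enabled only when every bit of its stack is 1.

```python
STACK_BITS = 8                      # assumed depth of the enable stack
ALL_ONES = (1 << STACK_BITS) - 1


class EnableStack:
    """Model of the enable stack of one PE."""

    def __init__(self) -> None:
        self.bits = ALL_ONES        # start with nothing disabled

    def push(self, condition: bool) -> None:
        """if.*: shift left and place the condition in the lsb."""
        self.bits = ((self.bits << 1) | int(bool(condition))) & ALL_ONES

    def set_lsb(self, condition: bool) -> None:
        """andif.*: overwrite the lsb with the condition."""
        self.bits = (self.bits & (ALL_ONES ^ 1)) | int(bool(condition))

    def toggle(self) -> None:
        """else: toggle the lsb."""
        self.bits ^= 1

    def pop(self) -> None:
        """endif: shift right, putting 1 in the msb."""
        self.bits = (self.bits >> 1) | (1 << (STACK_BITS - 1))

    def enabled(self) -> bool:
        """Assumption: the PE runs only if all stack bits are 1."""
        return self.bits == ALL_ONES
```

In this model a PE that fails an if.* condition stays disabled until the matching else or endif. An else nested inside a disabled outer block cannot re-enable the PE, because the outer 0 bit is still on the stack.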


any.enable


Operands: dst

Description:
If any pes are enabled, then dst is set to 1, otherwise 0.

Constraints:
• dst has domain mono.
• dst has width of 2.

Notes: There will be an additional delay on the real hardware of 10-20 cycles, as the
information filters through the hardware pipeline, before the result is available to the user.

Details:
dst  cycles  latency  temporaries  comments
m2   6       6                     macro


all.disable 


Operands: dst

Description:
If all pes are disabled, then dst is set to 1, otherwise 0.

Constraints:
• dst has domain mono.
• dst has width of 2.

Notes: There will be an additional delay on the real hardware of 10-20 cycles, as the
information filters through the hardware pipeline, before the result is available to the user.
This is more efficient than any.enable.

Details:
dst  cycles  latency  temporaries  comments
m2   5       5                     macro
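
The results of any.enable and all.disable are logical complements of each other. The following Python sketch shows this for a list of per-PE enable flags. The function names and the list representation are hypothetical and are used only for illustration; the sketch does not model the 10-20 cycle pipeline delay mentioned in the notes.

```python
from typing import Sequence


def any_enable(pe_enabled: Sequence[bool]) -> int:
    """Value written to dst by any.enable: 1 if any PE is enabled."""
    return 1 if any(pe_enabled) else 0


def all_disable(pe_enabled: Sequence[bool]) -> int:
    """Value written to dst by all.disable: 1 if every PE is disabled."""
    return 0 if any(pe_enabled) else 1
```

Because one result is always the inverse of the other, a test written with any.enable can be rewritten to use all.disable, which the details tables show as one cycle cheaper.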


3.1.7 Predicate instructions 


These instructions perform bit operations on predicates, and also transfer predicates to and
from the mono register file. Bit operations require immediates to refer to individual predicate bits.


pred.clr 
Operands: dst

Description:
predicate dst = 0.

Constraints:
• dst has domain immediate.
• dst has width of 4.

Notes: The status register is included in the 16 predicate bits. The constants which refer to
individual predicates are in predicate_constants.inc. The immediates used to denote the
predicate should indicate a single predicate for the operation, i.e. the predicate immediate is
not a bit mask, but a single value referencing a single predicate. The immediate values for
the various predicate bits are as follows: 0: Carry or FPU inexact; 1: Zero; 2: MSB or
FPU underflow; 3: Overflow; 4: Negative; 5: True, always set; 6-15: User defined.

Details:
dst  cycles  latency  temporaries  comments
i4   1       1                     hardware


pred.set 


Operands: dst

Description:
predicate dst = 1.

Constraints:
• dst has domain immediate.
• dst has width of 4.

Notes: The status register is included in the 16 predicate bits. The constants which refer to
individual predicates are in predicate_constants.inc. The immediates used to denote the
predicate should indicate a single predicate for the operation, i.e. the predicate immediate is
not a bit mask, but a single value referencing a single predicate. The immediate values for
the various predicate bits are as follows: 0: Carry or FPU inexact; 1: Zero; 2: MSB or
FPU underflow; 3: Overflow; 4: Negative; 5: True, always set; 6-15: User defined.

Details:
dst  cycles  latency  temporaries  comments
i4   1       1                     hardware


pred.mov 


Operands: dst, src

Description:
predicate dst = predicate src.

Constraints:
• dst has domain immediate.
• dst has width of 4.
• src has domain immediate.
• src has width of 4.

Notes: The status register is included in the 16 predicate bits. The constants which refer to
individual predicates are in predicate_constants.inc. The immediates used to denote the
predicate should indicate a single predicate for the operation, i.e. the predicate immediate is
not a bit mask, but a single value referencing a single predicate. The immediate values for
the various predicate bits are as follows: 0: Carry or FPU inexact; 1: Zero; 2: MSB or
FPU underflow; 3: Overflow; 4: Negative; 5: True, always set; 6-15: User defined.

Details:
dst  src  cycles  latency  temporaries  comments
i4   i4   1       1                     hardware


pred.not 


Operands: dst, src

Description:
predicate dst = inverse of predicate src.

Constraints:
• dst has domain immediate.
• dst has width of 4.
• src has domain immediate.
• src has width of 4.

Notes: The status register is included in the 16 predicate bits. The constants which refer to
individual predicates are in predicate_constants.inc. The immediates used to denote the
predicate should indicate a single predicate for the operation, i.e. the predicate immediate is
not a bit mask, but a single value referencing a single predicate. The immediate values for
the various predicate bits are as follows: 0: Carry or FPU inexact; 1: Zero; 2: MSB or
FPU underflow; 3: Overflow; 4: Negative; 5: True, always set; 6-15: User defined.

Details:
dst  src  cycles  latency  temporaries  comments
i4   i4   1       1                     hardware


pred.and 


Operands: dst, src0, src1

Description:
predicate dst = predicate src0 & predicate src1.

Constraints:
• dst has domain immediate.
• dst has width of 4.
• src0 has domain immediate.
• src0 has width of 4.
• src1 has domain immediate.
• src1 has width of 4.

Notes: The status register is included in the 16 predicate bits. The constants which refer to
individual predicates are in predicate_constants.inc. The immediates used to denote the
predicate should indicate a single predicate for the operation, i.e. the predicate immediate is
not a bit mask, but a single value referencing a single predicate. The immediate values for
the various predicate bits are as follows: 0: Carry or FPU inexact; 1: Zero; 2: MSB or
FPU underflow; 3: Overflow; 4: Negative; 5: True, always set; 6-15: User defined.

Details:
dst  src0  src1  cycles  latency  temporaries  comments
i4   i4    i4    1       1                     hardware


pred.nand 


Operands: dst, src0, src1

Description:
predicate dst = inverse of (predicate src0 & predicate src1).

Constraints:
• dst has domain immediate.
• dst has width of 4.
• src0 has domain immediate.
• src0 has width of 4.
• src1 has domain immediate.
• src1 has width of 4.

Notes: The status register is included in the 16 predicate bits. The constants which refer to
individual predicates are in predicate_constants.inc. The immediates used to denote the
predicate should indicate a single predicate for the operation, i.e. the predicate immediate is
not a bit mask, but a single value referencing a single predicate. The immediate values for
the various predicate bits are as follows: 0: Carry or FPU inexact; 1: Zero; 2: MSB or
FPU underflow; 3: Overflow; 4: Negative; 5: True, always set; 6-15: User defined.

Details:
dst  src0  src1  cycles  latency  temporaries  comments
i4   i4    i4    1       1                     hardware


pred.or 


Operands: dst, src0, src1

Description:
predicate dst = predicate src0 | predicate src1.

Constraints:
• dst has domain immediate.
• dst has width of 4.
• src0 has domain immediate.
• src0 has width of 4.
• src1 has domain immediate.
• src1 has width of 4.

Notes: The status register is included in the 16 predicate bits. The constants which refer to
individual predicates are in predicate_constants.inc. The immediates used to denote the
predicate should indicate a single predicate for the operation, i.e. the predicate immediate is
not a bit mask, but a single value referencing a single predicate. The immediate values for
the various predicate bits are as follows: 0: Carry or FPU inexact; 1: Zero; 2: MSB or
FPU underflow; 3: Overflow; 4: Negative; 5: True, always set; 6-15: User defined.

Details:
dst  src0  src1  cycles  latency  temporaries  comments
i4   i4    i4    1       1                     hardware


pred.nor 


Operands: dst, src0, src1

Description:
predicate dst = inverse of (predicate src0 | predicate src1).

Constraints:
• dst has domain immediate.
• dst has width of 4.
• src0 has domain immediate.
• src0 has width of 4.
• src1 has domain immediate.
• src1 has width of 4.

Notes: The status register is included in the 16 predicate bits. The constants which refer to
individual predicates are in predicate_constants.inc. The immediates used to denote the
predicate should indicate a single predicate for the operation, i.e. the predicate immediate is
not a bit mask, but a single value referencing a single predicate. The immediate values for
the various predicate bits are as follows: 0: Carry or FPU inexact; 1: Zero; 2: MSB or
FPU underflow; 3: Overflow; 4: Negative; 5: True, always set; 6-15: User defined.

Details:
dst  src0  src1  cycles  latency  temporaries  comments
i4   i4    i4    1       1                     hardware


pred.xor 


Operands: dst, src0, src1

Description:
predicate dst = predicate src0 ^ predicate src1.

Constraints:
• dst has domain immediate.
• dst has width of 4.
• src0 has domain immediate.
• src0 has width of 4.
• src1 has domain immediate.
• src1 has width of 4.

Notes: The status register is included in the 16 predicate bits. The constants which refer to
individual predicates are in predicate_constants.inc. The immediates used to denote the
predicate should indicate a single predicate for the operation, i.e. the predicate immediate is
not a bit mask, but a single value referencing a single predicate. The immediate values for
the various predicate bits are as follows: 0: Carry or FPU inexact; 1: Zero; 2: MSB or
FPU underflow; 3: Overflow; 4: Negative; 5: True, always set; 6-15: User defined.

Details:
dst  src0  src1  cycles  latency  temporaries  comments
i4   i4    i4    1       1                     hardware


pred.nxor 


Operands: dst, src0, src1

Description:
predicate dst = ~(predicate src0 ^ predicate src1).

Constraints:
• dst has domain immediate.
• dst has width of 4.
• src0 has domain immediate.
• src0 has width of 4.
• src1 has domain immediate.
• src1 has width of 4.

Notes: The status register is included in the 16 predicate bits. The constants which refer to
individual predicates are in predicate_constants.inc. The immediates used to denote the
predicate should indicate a single predicate for the operation, i.e. the predicate immediate is
not a bit mask, but a single value referencing a single predicate. The immediate values for
the various predicate bits are as follows: 0: Carry or FPU inexact; 1: Zero; 2: MSB or
FPU underflow; 3: Overflow; 4: Negative; 5: True, always set; 6-15: User defined.

Details:
dst  src0  src1  cycles  latency  temporaries  comments
i4   i4    i4    1       1                     hardware


pred.naorb 


Operands: dst, src0, src1

Description:
predicate dst = (inverse of predicate src0) | predicate src1.

Constraints:
• dst has domain immediate.
• dst has width of 4.
• src0 has domain immediate.
• src0 has width of 4.
• src1 has domain immediate.
• src1 has width of 4.

Notes: The status register is included in the 16 predicate bits. The constants which refer to
individual predicates are in predicate_constants.inc. The immediates used to denote the
predicate should indicate a single predicate for the operation, i.e. the predicate immediate is
not a bit mask, but a single value referencing a single predicate. The immediate values for
the various predicate bits are as follows: 0: Carry or FPU inexact; 1: Zero; 2: MSB or
FPU underflow; 3: Overflow; 4: Negative; 5: True, always set; 6-15: User defined.

Details:
dst  src0  src1  cycles  latency  temporaries  comments
i4   i4    i4    1       1                     hardware


pred.naandb 


Operands: dst, src0, src1

Description:
predicate dst = (inverse of predicate src0) & predicate src1.

Constraints:
• dst has domain immediate.
• dst has width of 4.
• src0 has domain immediate.
• src0 has width of 4.
• src1 has domain immediate.
• src1 has width of 4.

Notes: The status register is included in the 16 predicate bits. The constants which refer to
individual predicates are in predicate_constants.inc. The immediates used to denote the
predicate should indicate a single predicate for the operation, i.e. the predicate immediate is
not a bit mask, but a single value referencing a single predicate. The immediate values for
the various predicate bits are as follows: 0: Carry or FPU inexact; 1: Zero; 2: MSB or
FPU underflow; 3: Overflow; 4: Negative; 5: True, always set; 6-15: User defined.

Details:
dst  src0  src1  cycles  latency  temporaries  comments
i4   i4    i4    1       1                     hardware


pred.get 


Operands: dst

Description:
Copies predicates into dst.

Constraints:
• dst has domain mono.
• dst has width of 2.

Notes: The status register is included in the 16 predicate bits. The constants which refer to
individual predicates are in predicate_constants.inc. The immediates used to denote the
predicate should indicate a single predicate for the operation, i.e. the predicate immediate is
not a bit mask, but a single value referencing a single predicate. The immediate values for
the various predicate bits are as follows: 0: Carry or FPU inexact; 1: Zero; 2: MSB or
FPU underflow; 3: Overflow; 4: Negative; 5: True, always set; 6-15: User defined.

Details:
dst  cycles  latency  temporaries  comments
m2   1       1                     hardware

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pred.put 
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Operands: src 


Description: 


copies src into predicates. 


Constraints: 


e src has domain mono or immediate. 
e src has width of 2. 


Notes: The status register is included in the 16 predicate bits. The constants which refer to 
individual predicates are in predicate_constants.inc. The immediates used to denote the 
predicate should indicate a single predicate for the operation, i.e. the predicate immediate is 
not a bit mask, but a single value referencing a single predicate The immediate values for 
the various predicate bits are as follows 0:CarryorFPUinexact 1:Zero 2: MSBor 
FPU underflow 3: Overflow 4:Negative 5: True, always set 6-15: User defined 
Details: 


src cycles latency temporaries | comments 
m2 1 1 hardware 
i2 1 1 hardware 
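
The difference between a predicate index and a bit mask can be illustrated with a small Python model. This is only a sketch under stated assumptions: it assumes that predicate n occupies bit n of the 2-byte value moved by pred.get and pred.put, and that predicate 5 cannot be cleared. The class and constant names are hypothetical and are not taken from predicate_constants.inc.

```python
# Predicate indices as listed in the Notes (constant names are hypothetical).
PRED_CARRY, PRED_ZERO, PRED_MSB, PRED_OVERFLOW, PRED_NEGATIVE, PRED_TRUE = range(6)


class Predicates:
    """Toy model of the 16 predicate bits (assumes predicate n is bit n)."""

    def __init__(self):
        # Predicate 5 is documented as "True, always set".
        self.bits = 1 << PRED_TRUE

    def put(self, value):
        # Models pred.put: copy a 2-byte value into the predicates.
        # Assumption: the always-set predicate cannot be cleared.
        self.bits = (value & 0xFFFF) | (1 << PRED_TRUE)

    def get(self):
        # Models pred.get: copy the predicates into a 2-byte value.
        return self.bits

    def test(self, index):
        # A predicate operand is an index in 0..15, never a mask.
        if not 0 <= index <= 15:
            raise ValueError("predicate index must be in 0..15")
        return (self.bits >> index) & 1


preds = Predicates()
preds.put(0b0000_0000_0000_0010)    # write a value with only bit 1 set
zero_set = preds.test(PRED_ZERO)    # selects predicate 1 by index, not by mask 0b10
carry_set = preds.test(PRED_CARRY)  # selects predicate 0
```

Passing the mask 0b10 where an index is expected would select predicate 2 (MSB or FPU underflow) rather than Zero, which is the mistake the Notes warn against.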

3.1.8 Branch instructions

These instructions perform branches to other addresses. Branches may be conditional.
Note that a taken branch incurs a 2 cycle delay over and above the cost of the
instruction. There are also instructions for making variable argument function calls. Macro
branch instructions can be used by including branch.inc.
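
The cost of a branch can be worked through with a short Python sketch. It assumes that the cycles column in the Details tables excludes the taken-branch delay, so that the delay is simply added when the branch is taken; the function name is hypothetical.

```python
TAKEN_BRANCH_DELAY = 2  # extra cycles when the branch is taken, per the text above


def branch_cycles(instruction_cycles, taken):
    """Total cycles for a branch whose table entry lists instruction_cycles."""
    return instruction_cycles + (TAKEN_BRANCH_DELAY if taken else 0)


hardware_not_taken = branch_cycles(2, taken=False)  # a 2 cycle hardware branch
hardware_taken = branch_cycles(2, taken=True)
macro_taken = branch_cycles(6, taken=True)          # a 6 cycle macro branch
```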

Operands: addr

Description:

Branch to address addr.

Constraints:

• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Details:

addr    cycles   latency   temporaries   comments
m4us    2        2                       hardware
il4us   2        2                       hardware


j.if.pred

Operands: pred, addr

Description:

Branch to address addr if the predicate pred is set to 1.

Constraints:

• pred has domain immediate.
• pred has type integer.
• pred has width of 4.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Notes: The immediate values for the various predicate bits are as follows: 0: Carry or
FPU inexact; 1: Zero; 2: MSB or FPU underflow; 3: Overflow; 4: Negative; 5: True,
always set; 6-15: User defined.

Details:

pred   addr    cycles   latency   temporaries   comments
i4us   m4us    2        2                       hardware
i4us   il4us   2        2                       hardware

j.if.npred

Operands: pred, addr

Description:

Branch to address addr if the predicate pred is set to 0.

Constraints:

• pred has domain immediate.
• pred has type integer.
• pred has width of 4.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Notes: The immediate values for the various predicate bits are as follows: 0: Carry or
FPU inexact; 1: Zero; 2: MSB or FPU underflow; 3: Overflow; 4: Negative; 5: True,
always set; 6-15: User defined.

Details:

pred   addr    cycles   latency   temporaries   comments
i4us   m4us    2        2                       hardware
i4us   il4us   2        2                       hardware

j.if.preds.combined.or

Operands: pred1, invert1, pred2, invert2, addr

Description:

Branch to address addr if the bitwise or of the 2 predicates pred1
and pred2 is one, inverting either bit with invert1 and invert2.

Constraints:

• pred1 has domain immediate.
• pred1 has type integer.
• pred1 has width of 4.
• invert1 has domain immediate.
• invert1 has type integer.
• invert1 has width of 4.
• pred2 has domain immediate.
• pred2 has type integer.
• pred2 has width of 4.
• invert2 has domain immediate.
• invert2 has type integer.
• invert2 has width of 4.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Notes: The immediate values for the various predicate bits are as follows: 0: Carry or
FPU inexact; 1: Zero; 2: MSB or FPU underflow; 3: Overflow; 4: Negative; 5: True,
always set; 6-15: User defined.

Details:

pred1   invert1   pred2   invert2   addr    cycles   latency   temporaries
i4us    i4us      i4us    i4us      m4us    2
i4us    i4us      i4us    i4us      il4us   2

j.if.preds.combined.and

Operands: pred1, invert1, pred2, invert2, addr

Description:

Branch to address addr if the bitwise and of the 2 predicates pred1
and pred2 is one, inverting either bit with invert1 and invert2.

Constraints:

• pred1 has domain immediate.
• pred1 has type integer.
• pred1 has width of 4.
• invert1 has domain immediate.
• invert1 has type integer.
• invert1 has width of 4.
• pred2 has domain immediate.
• pred2 has type integer.
• pred2 has width of 4.
• invert2 has domain immediate.
• invert2 has type integer.
• invert2 has width of 4.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Notes: The immediate values for the various predicate bits are as follows: 0: Carry or
FPU inexact; 1: Zero; 2: MSB or FPU underflow; 3: Overflow; 4: Negative; 5: True,
always set; 6-15: User defined.

Details:

pred1   invert1   pred2   invert2   addr    cycles   latency   temporaries
i4us    i4us      i4us    i4us      m4us    2
i4us    i4us      i4us    i4us      il4us   2

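
The combined predicate tests can be sketched in Python as follows. This is an illustrative model rather than a description of the hardware: it assumes that predicate n is bit n of a 16-bit predicate value, and that each invert operand is either 0 (use the bit as is) or 1 (invert the bit). All function names are hypothetical.

```python
def pred_bit(bits, index):
    """Return predicate number index (0..15) from a 16-bit predicate value."""
    return (bits >> index) & 1


def combined_or(bits, pred1, invert1, pred2, invert2):
    # Each selected bit is optionally inverted, then the two bits are ORed.
    return (pred_bit(bits, pred1) ^ invert1) | (pred_bit(bits, pred2) ^ invert2)


def combined_and(bits, pred1, invert1, pred2, invert2):
    # Each selected bit is optionally inverted, then the two bits are ANDed.
    return (pred_bit(bits, pred1) ^ invert1) & (pred_bit(bits, pred2) ^ invert2)


CARRY, ZERO = 0, 1  # predicate indices from the Notes
bits = 0b10         # Zero set, Carry clear

or_taken = combined_or(bits, CARRY, 0, ZERO, 0)       # carry OR zero
and_taken = combined_and(bits, CARRY, 0, ZERO, 0)     # carry AND zero
and_inverted = combined_and(bits, CARRY, 1, ZERO, 0)  # (NOT carry) AND zero
```

A result of 1 corresponds to the branch being taken. Inverting one input allows a condition such as "zero and not carry" to be tested with a single instruction.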
j.if.cry

Operands: addr

Description:

Branch to address addr if carry flag is set to 1.

Constraints:

• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Details:

addr    cycles   latency   temporaries   comments
m4us    2        2                       hardware
il4us   2        2                       hardware

j.if.ncry

Operands: addr

Description:

Branch to address addr if carry flag is set to 0.

Constraints:

• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Details:

addr    cycles   latency   temporaries   comments
m4us    2        2                       hardware
il4us   2        2                       hardware

j.if.zero

Operands: addr

Description:

Branch to address addr if zero flag is set to 1.

Constraints:

• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Details:

addr    cycles   latency   temporaries   comments
m4us    2        2                       hardware
il4us   2        2                       hardware

j.if.nzero

Operands: addr

Description:

Branch to address addr if zero flag is set to 0.

Constraints:

• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Details:

addr    cycles   latency   temporaries   comments
m4us    2        2                       hardware
il4us   2        2                       hardware

j.if.neg

Operands: addr

Description:

Branch to address addr if negative flag is set to 1.

Constraints:

• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Details:

addr    cycles   latency   temporaries   comments
m4us    2        2                       hardware
il4us   2        2                       hardware

j.if.nneg

Operands: addr

Description:

Branch to address addr if negative flag is set to 0.

Constraints:

• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Details:

addr    cycles   latency   temporaries   comments
m4us    2        2                       hardware
il4us   2        2                       hardware

j.if.msb

Operands: addr

Description:

Branch to address addr if msb flag is set to 1.

Constraints:

• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Details:

addr    cycles   latency   temporaries   comments
m4us    2        2                       hardware
il4us   2        2                       hardware

j.if.nmsb

Operands: addr

Description:

Branch to address addr if msb flag is set to 0.

Constraints:

• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Details:

addr    cycles   latency   temporaries   comments
m4us    2        2                       hardware
il4us   2        2                       hardware

j.if.ovft

Operands: addr

Description:

Branch to address addr if overflow flag is set to 1.

Constraints:

• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Details:

addr    cycles   latency   temporaries   comments
m4us    2        2                       hardware
il4us   2        2                       hardware

j.if.novft

Operands: addr

Description:

Branch to address addr if overflow flag is set to 0.

Constraints:

• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Details:

addr    cycles   latency   temporaries   comments
m4us    2        2                       hardware
il4us   2        2                       hardware

j.if.lt (float (32 bit))

Operands: src0, src1, addr

Description:

Branch to address addr if src0 < src1.

Constraints:

• src0 has domain mono.
• src0 has type float.
• src0 has width of 4.
• src1 has domain mono or immediate or label.
• src1 has type float.
• src1 has width of 4.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Side Effects:

• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand
as stated in the documentation can be an immediate or label and has the same cycle timing
as if it was a mono register.

Details:

src0   src1   addr    cycles   latency   temporaries   comments
m4f    m4f    m4us    6        6                       macro
m4f    m4f    il4us   6        6                       macro
m4f    il4f   m4us    11       11        4 mono        macro
m4f    il4f   il4us   11       11        4 mono        macro

j.if.lt (float (64 bit))

Operands: src0, src1, addr

Description:

Branch to address addr if src0 < src1.

Constraints:

• src0 has domain mono.
• src0 has type float.
• src0 has width of 8.
• src1 has domain mono or immediate or label.
• src1 has type float.
• src1 has width of 8.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Side Effects:

• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand
as stated in the documentation can be an immediate or label and has the same cycle timing
as if it was a mono register.

Details:

src0   src1   addr    cycles   latency   temporaries   comments
m8f    m8f    m4us    6        6                       macro
m8f    m8f    il4us   6        6                       macro
m8f    il8f   m4us    15       15        8 mono        macro
m8f    il8f   il4us   15       15        8 mono        macro

j.if.lt (unsigned)

Operands: src0, src1, addr

Description:

Branch to address addr if src0 < src1.

Constraints:

• src0 has domain mono.
• src0 has type unsigned.
• src1 has domain mono or immediate or label.
• src1 has type unsigned.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Side Effects:

• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand
as stated in the documentation can be an immediate or label and has the same cycle timing
as if it was a mono register.

Details:

src0   src1   addr    cycles   latency   temporaries   comments
m2u    m2u    m4us    3        3                       macro
m2u    m2u    il4us   3        3                       macro
m2u    il2u   m4us    4        4                       macro
m2u    il2u   il4us   4        4                       macro
m4u    m4u    m4us    4        4                       macro
m4u    m4u    il4us   4        4                       macro
m4u    il4u   m4us    6        6                       macro
m4u    il4u   il4us   6        6                       macro

j.if.lt (signed)

Operands: src0, src1, addr

Description:

Branch to address addr if src0 < src1.

Constraints:

• src0 has domain mono.
• src0 has type signed.
• src1 has domain mono or immediate or label.
• src1 has type signed.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand
as stated in the documentation can be an immediate or label and has the same cycle timing
as if it was a mono register.

Details:

src0   src1   addr    cycles   latency   temporaries   comments
m2s    m2s    m4us    3        3                       macro
m2s    m2s    il4us   3        3                       macro
m2s    il2s   m4us    4        4                       macro
m2s    il2s   il4us   4        4                       macro
m4s    m4s    m4us    4        4                       macro
m4s    m4s    il4us   4        4                       macro
m4s    il4s   m4us    6        6                       macro
m4s    il4s   il4us   6        6                       macro

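
The signed and unsigned forms of a comparison exist because the same bit pattern orders differently under the two interpretations. The Python sketch below illustrates this for 2-byte operands. It assumes a two's complement representation for signed values, which this section does not state explicitly; the helper names are hypothetical.

```python
def as_unsigned(bits, width_bytes):
    """Interpret the low width_bytes bytes of bits as an unsigned integer."""
    return bits & ((1 << (8 * width_bytes)) - 1)


def as_signed(bits, width_bytes):
    """Interpret the low width_bytes bytes of bits as two's complement (assumed)."""
    n = 8 * width_bytes
    value = bits & ((1 << n) - 1)
    return value - (1 << n) if value & (1 << (n - 1)) else value


src0, src1 = 0xFFFF, 0x0001  # the same 2-byte patterns for both comparisons

unsigned_lt = as_unsigned(src0, 2) < as_unsigned(src1, 2)  # compares 65535 with 1
signed_lt = as_signed(src0, 2) < as_signed(src1, 2)        # compares -1 with 1
```

With these operands the unsigned comparison is false and the signed comparison is true, so choosing the operand type selects which branch behaviour is obtained.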
j.if.le (float (32 bit))

Operands: src0, src1, addr

Description:

Branch to address addr if src0 <= src1.

Constraints:

• src0 has domain mono.
• src0 has type float.
• src0 has width of 4.
• src1 has domain mono or immediate or label.
• src1 has type float.
• src1 has width of 4.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Side Effects:

• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand
as stated in the documentation can be an immediate or label and has the same cycle timing
as if it was a mono register.

Details:

src0   src1   addr    cycles   latency   temporaries   comments
m4f    m4f    m4us    6        6                       macro
m4f    m4f    il4us   6        6                       macro
m4f    il4f   m4us    11       11        4 mono        macro
m4f    il4f   il4us   11       11        4 mono        macro

j.if.le (float (64 bit))

Operands: src0, src1, addr

Description:

Branch to address addr if src0 <= src1.

Constraints:

• src0 has domain mono.
• src0 has type float.
• src0 has width of 8.
• src1 has domain mono or immediate or label.
• src1 has type float.
• src1 has width of 8.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Side Effects:

• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand
as stated in the documentation can be an immediate or label and has the same cycle timing
as if it was a mono register.

Details:

src0   src1   addr    cycles   latency   temporaries   comments
m8f    m8f    m4us    6        6                       macro
m8f    m8f    il4us   6        6                       macro
m8f    il8f   m4us    15       15        8 mono        macro
m8f    il8f   il4us   15       15        8 mono        macro

Document No. 06-RM-1137 Revision: 3.A 235 


ClearSpeed Technology plc 


j.if.le (unsigned)

Operands: src0, src1, addr

Description:

Branch to address addr if src0 <= src1.

Constraints:

• src0 has domain mono.
• src0 has type unsigned.
• src1 has domain mono or immediate or label.
• src1 has type unsigned.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand
as stated in the documentation can be an immediate or label and has the same cycle timing
as if it was a mono register.

Details:

src0   src1   addr    cycles   latency   temporaries   comments
m2u    m2u    m4us    3        3                       macro
m2u    m2u    il4us   3        3                       macro
m2u    il2u   m4us    4        4                       macro
m2u    il2u   il4us   4        4                       macro
m4u    m4u    m4us    4        4                       macro
m4u    m4u    il4us   4        4                       macro
m4u    il4u   m4us    6        6                       macro
m4u    il4u   il4us   6        6                       macro

j.if.le (signed)

Operands: src0, src1, addr

Description:

Branch to address addr if src0 <= src1.

Constraints:

• src0 has domain mono.
• src0 has type signed.
• src1 has domain mono or immediate or label.
• src1 has type signed.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand
as stated in the documentation can be an immediate or label and has the same cycle timing
as if it was a mono register.

Details:

src0   src1   addr    cycles   latency   temporaries   comments
m2s    m2s    m4us    3        3                       macro
m2s    m2s    il4us   3        3                       macro
m2s    il2s   m4us    4        4                       macro
m2s    il2s   il4us   4        4                       macro
m4s    m4s    m4us    4        4                       macro
m4s    m4s    il4us   4        4                       macro
m4s    il4s   m4us    6        6                       macro
m4s    il4s   il4us   6        6                       macro

j.if.eq (float (32 bit))

Operands: src0, src1, addr

Description:

Branch to address addr if src0 == src1.

Constraints:

• src0 has domain mono.
• src0 has type float.
• src0 has width of 4.
• src1 has domain mono or immediate or label.
• src1 has type float.
• src1 has width of 4.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Side Effects:

• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand
as stated in the documentation can be an immediate or label and has the same cycle timing
as if it was a mono register.

Details:

src0   src1   addr    cycles   latency   temporaries   comments
m4f    m4f    m4us    6        6                       macro
m4f    m4f    il4us   6        6                       macro
m4f    il4f   m4us    11       11        4 mono        macro
m4f    il4f   il4us   11       11        4 mono        macro

j.if.eq (float (64 bit))

Operands: src0, src1, addr

Description:

Branch to address addr if src0 == src1.

Constraints:

• src0 has domain mono.
• src0 has type float.
• src0 has width of 8.
• src1 has domain mono or immediate or label.
• src1 has type float.
• src1 has width of 8.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Side Effects:

• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand
as stated in the documentation can be an immediate or label and has the same cycle timing
as if it was a mono register.

Details:

src0   src1   addr    cycles   latency   temporaries   comments
m8f    m8f    m4us    6        6                       macro
m8f    m8f    il4us   6        6                       macro
m8f    il8f   m4us    15       15        8 mono        macro
m8f    il8f   il4us   15       15        8 mono        macro

j.if.eq

Operands: src0, src1, addr

Description:

Branch to address addr if src0 == src1.

Constraints:

• src0 has domain mono.
• src0 has type integer.
• src1 has domain mono or immediate or label.
• src1 has type integer.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand
as stated in the documentation can be an immediate or label and has the same cycle timing
as if it was a mono register.

Details:

src0   src1    addr    cycles   latency   temporaries   comments
m2us   m2us    m4us    3        3                       macro
m2us   m2us    il4us   3        3                       macro
m2us   il2us   m4us    4        4                       macro
m2us   il2us   il4us   4        4                       macro
m4us   m4us    m4us    4        4                       macro
m4us   m4us    il4us   4        4                       macro
m4us   il4us   m4us    6        6                       macro
m4us   il4us   il4us   6        6                       macro

j.if.ne (float (32 bit))

Operands: src0, src1, addr

Description:

Branch to address addr if src0 != src1.

Constraints:

• src0 has domain mono.
• src0 has type float.
• src0 has width of 4.
• src1 has domain mono or immediate or label.
• src1 has type float.
• src1 has width of 4.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Side Effects:

• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand
as stated in the documentation can be an immediate or label and has the same cycle timing
as if it was a mono register.

Details:

src0   src1   addr    cycles   latency   temporaries   comments
m4f    m4f    m4us    6        6                       macro
m4f    m4f    il4us   6        6                       macro
m4f    il4f   m4us    11       11        4 mono        macro
m4f    il4f   il4us   11       11        4 mono        macro

j.if.ne (float (64 bit))

Operands: src0, src1, addr

Description:

Branch to address addr if src0 != src1.

Constraints:

• src0 has domain mono.
• src0 has type float.
• src0 has width of 8.
• src1 has domain mono or immediate or label.
• src1 has type float.
• src1 has width of 8.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Side Effects:

• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand
as stated in the documentation can be an immediate or label and has the same cycle timing
as if it was a mono register.

Details:

src0   src1   addr    cycles   latency   temporaries   comments
m8f    m8f    m4us    6        6                       macro
m8f    m8f    il4us   6        6                       macro
m8f    il8f   m4us    15       15        8 mono        macro
m8f    il8f   il4us   15       15        8 mono        macro

j.if.ne

Operands: src0, src1, addr

Description:

Branch to address addr if src0 != src1.

Constraints:

• src0 has domain mono.
• src0 has type integer.
• src1 has domain mono or immediate or label.
• src1 has type integer.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand
as stated in the documentation can be an immediate or label and has the same cycle timing
as if it was a mono register.

Details:

src0   src1    addr    cycles   latency   temporaries   comments
m2us   m2us    m4us    3        3                       macro
m2us   m2us    il4us   3        3                       macro
m2us   il2us   m4us    4        4                       macro
m2us   il2us   il4us   4        4                       macro
m4us   m4us    m4us    4        4                       macro
m4us   m4us    il4us   4        4                       macro
m4us   il4us   m4us    6        6                       macro
m4us   il4us   il4us   6        6                       macro

j.if.gt (float (32 bit))

Operands: src0, src1, addr

Description:

Branch to address addr if src0 > src1.

Constraints:

• src0 has domain mono.
• src0 has type float.
• src0 has width of 4.
• src1 has domain mono or immediate or label.
• src1 has type float.
• src1 has width of 4.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Side Effects:

• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand
as stated in the documentation can be an immediate or label and has the same cycle timing
as if it was a mono register.

Details:

src0   src1   addr    cycles   latency   temporaries   comments
m4f    m4f    m4us    6        6                       macro
m4f    m4f    il4us   6        6                       macro
m4f    il4f   m4us    11       11        4 mono        macro
m4f    il4f   il4us   11       11        4 mono        macro

j.if.gt (float (64 bit))

Operands: src0, src1, addr

Description:

Branch to address addr if src0 > src1.

Constraints:

• src0 has domain mono.
• src0 has type float.
• src0 has width of 8.
• src1 has domain mono or immediate or label.
• src1 has type float.
• src1 has width of 8.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Side Effects:

• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand
as stated in the documentation can be an immediate or label and has the same cycle timing
as if it was a mono register.

Details:

src0   src1   addr    cycles   latency   temporaries   comments
m8f    m8f    m4us    6        6                       macro
m8f    m8f    il4us   6        6                       macro
m8f    il8f   m4us    15       15        8 mono        macro
m8f    il8f   il4us   15       15        8 mono        macro

j.if.gt (unsigned)

Operands: src0, src1, addr

Description:

Branch to address addr if src0 > src1.

Constraints:

• src0 has domain mono.
• src0 has type unsigned.
• src1 has domain mono or immediate or label.
• src1 has type unsigned.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand
as stated in the documentation can be an immediate or label and has the same cycle timing
as if it was a mono register.

Details:

src0   src1   addr    cycles   latency   temporaries   comments
m2u    m2u    m4us    3        3                       macro
m2u    m2u    il4us   3        3                       macro
m2u    il2u   m4us    4        4                       macro
m2u    il2u   il4us   4        4                       macro
m4u    m4u    m4us    4        4                       macro
m4u    m4u    il4us   4        4                       macro
m4u    il4u   m4us    6        6                       macro
m4u    il4u   il4us   6        6                       macro

j.if.gt (signed)

Operands: src0, src1, addr

Description:

Branch to address addr if src0 > src1.

Constraints:

• src0 has domain mono.
• src0 has type signed.
• src1 has domain mono or immediate or label.
• src1 has type signed.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand
as stated in the documentation can be an immediate or label and has the same cycle timing
as if it was a mono register.

Details:

src0   src1   addr    cycles   latency   temporaries   comments
m2s    m2s    m4us    3        3                       macro
m2s    m2s    il4us   3        3                       macro
m2s    il2s   m4us    4        4                       macro
m2s    il2s   il4us   4        4                       macro
m4s    m4s    m4us    4        4                       macro
m4s    m4s    il4us   4        4                       macro
m4s    il4s   m4us    6        6                       macro
m4s    il4s   il4us   6        6                       macro

j.if.ge (float (32 bit))

Operands: src0, src1, addr

Description:

Branch to address addr if src0 >= src1.

Constraints:

• src0 has domain mono.
• src0 has type float.
• src0 has width of 4.
• src1 has domain mono or immediate or label.
• src1 has type float.
• src1 has width of 4.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Side Effects:

• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand
as stated in the documentation can be an immediate or label and has the same cycle timing
as if it was a mono register.

Details:

src0   src1   addr    cycles   latency   temporaries   comments
m4f    m4f    m4us    6        6                       macro
m4f    m4f    il4us   6        6                       macro
m4f    il4f   m4us    11       11        4 mono        macro
m4f    il4f   il4us   11       11        4 mono        macro

j.if.ge (float (64 bit))

Operands: src0, src1, addr

Description:

Branch to address addr if src0 >= src1.

Constraints:

• src0 has domain mono.
• src0 has type float.
• src0 has width of 8.
• src1 has domain mono or immediate or label.
• src1 has type float.
• src1 has width of 8.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Side Effects:

• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand
as stated in the documentation can be an immediate or label and has the same cycle timing
as if it was a mono register.

Details:

src0   src1   addr    cycles   latency   temporaries   comments
m8f    m8f    m4us    6        6                       macro
m8f    m8f    il4us   6        6                       macro
m8f    il8f   m4us    15       15        8 mono        macro
m8f    il8f   il4us   15       15        8 mono        macro

j.if.ge (unsigned)

Operands: src0, src1, addr

Description:

Branch to address addr if src0 >= src1.

Constraints:

• src0 has domain mono.
• src0 has type unsigned.
• src1 has domain mono or immediate or label.
• src1 has type unsigned.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand
as stated in the documentation can be an immediate or label and has the same cycle timing
as if it was a mono register.

Details:

src0   src1   addr    cycles   latency   temporaries   comments
m2u    m2u    m4us    3        3                       macro
m2u    m2u    il4us   3        3                       macro
m2u    il2u   m4us    4        4                       macro
m2u    il2u   il4us   4        4                       macro
m4u    m4u    m4us    4        4                       macro
m4u    m4u    il4us   4        4                       macro
m4u    il4u   m4us    6        6                       macro
m4u    il4u   il4us   6        6                       macro

j.if.ge (signed)

Operands: src0, src1, addr

Description:

Branch to address addr if src0 >= src1.

Constraints:

• src0 has domain mono.
• src0 has type signed.
• src1 has domain mono or immediate or label.
• src1 has type signed.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand
as stated in the documentation can be an immediate or label and has the same cycle timing
as if it was a mono register.

Details:

src0   src1   addr    cycles   latency   temporaries   comments
m2s    m2s    m4us    3        3                       macro
m2s    m2s    il4us   3        3                       macro
m2s    il2s   m4us    4        4                       macro
m2s    il2s   il4us   4        4                       macro
m4s    m4s    m4us    4        4                       macro
m4s    m4s    il4us   4        4                       macro
m4s    il4s   m4us    6        6                       macro
m4s    il4s   il4us   6        6                       macro

j.if

Operands: cond, addr

Description:

Branch to address addr if cond is non-zero.

Constraints:

• cond has type integer.
• cond has domain mono.
• addr has domain mono or immediate or label.
• addr has width of 4.

Side Effects:

• Leaves the status register in an undefined state.

Details:

cond   addr    cycles   latency   temporaries   comments
m2us   m4u     3        3                       macro
m2us   il4u    3        3                       macro
m4us   m4u     4        4                       macro
m4us   il4u    4        4                       macro
m2us   m4s     3        3                       macro
m2us   il4s    3        3                       macro
m4us   m4s     4        4                       macro
m4us   il4s    4        4                       macro
m4us   m4f     4        4                       macro
m4us   il4f    4        4                       macro

j.ifn

Operands: cond, addr

Description:

Branch to address addr if cond is zero.

Constraints:

• cond has type integer.
• cond has domain mono.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Side Effects:

• Leaves the status register in an undefined state.

Details:

cond   addr    cycles   latency   temporaries   comments
m2us   m4us                                     macro
m2us   il4us                                    macro
m4us   m4us                                     macro
m4us   il4us                                    macro

j.sub

Operands: addr

Description:

Branch to address addr, saving the return address.

Constraints:

• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Notes: The address saved can be branched to with the return instruction or read into
registers with the return.get instruction.

Details:

addr    cycles   latency   temporaries   comments
m4us    2        2                       hardware
il4us   2        2                       hardware

j.sub.if.pred

Operands: pred, addr

Description:

Branch to address addr if the predicate pred is set to 1, saving the
return address.

Constraints:

• pred has domain immediate.
• pred has type integer.
• pred has width of 4.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Notes: The immediate values for the various predicate bits are as follows: 0: Carry or
FPU inexact; 1: Zero; 2: MSB or FPU underflow; 3: Overflow; 4: Negative; 5: True,
always set; 6-15: User defined.

Details:

pred   addr    cycles   latency   temporaries   comments
i4us   m4us    2        2                       hardware
i4us   il4us   2        2                       hardware

j.sub.if.npred

Operands: pred, addr

Description:

Branch to address addr if the predicate pred is set to 0, saving the
return address.

Constraints:

• pred has domain immediate.
• pred has type integer.
• pred has width of 4.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Notes: The immediate values for the various predicate bits are as follows: 0: Carry or
FPU inexact; 1: Zero; 2: MSB or FPU underflow; 3: Overflow; 4: Negative; 5: True,
always set; 6-15: User defined.

Details:

pred   addr    cycles   latency   temporaries   comments
i4us   m4us    2        2                       hardware
i4us   il4us   2        2                       hardware

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CSX600 Instruction Set 


Instruction set description 


j.sub.if.preds.combined.or 


Operands: pred1, invert1, pred2, invert2, addr

Description:
Branch to address addr if the bitwise or of the 2 predicates pred1 and pred2, inverting either bit with invert1 and invert2, is one, saving the return address.

Constraints:
• pred1 has domain immediate.
• pred1 has type integer.
• pred1 has width of 4.
• invert1 has domain immediate.
• invert1 has type integer.
• invert1 has width of 4.
• pred2 has domain immediate.
• pred2 has type integer.
• pred2 has width of 4.
• invert2 has domain immediate.
• invert2 has type integer.
• invert2 has width of 4.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Notes: The immediate values for the various predicate bits are as follows:
0: Carry or FPU inexact
1: Zero
2: MSB or FPU underflow
3: Overflow
4: Negative
5: True, always set
6-15: User defined

Details:
pred1  invert1  pred2  invert2  addr   cycles  latency  temporaries
i4us   i4us     i4us   i4us     m4us   2
i4us   i4us     i4us   i4us     il4us  2
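
The branch condition described above can be sketched in Python. This is an illustrative model only: the names Pred, pred_bit, combined_or and combined_and are hypothetical, and it assumes that the predicates are held in a single integer with predicate n in bit n and that an invert value of 1 flips the selected bit. Neither assumption is confirmed by this manual.

```python
from enum import IntEnum


class Pred(IntEnum):
    """Predicate numbers from the Notes above (6-15 are user defined)."""
    CARRY_OR_FPU_INEXACT = 0
    ZERO = 1
    MSB_OR_FPU_UNDERFLOW = 2
    OVERFLOW = 3
    NEGATIVE = 4
    TRUE = 5  # always set


def pred_bit(preds, index):
    # ASSUMPTION: predicate n is stored in bit n of the integer `preds`.
    if index == Pred.TRUE:
        return 1  # predicate 5 is documented as always set
    return (preds >> index) & 1


def combined_or(preds, pred1, invert1, pred2, invert2):
    # ASSUMPTION: an invert value of 1 flips the selected predicate bit.
    a = pred_bit(preds, pred1) ^ (invert1 & 1)
    b = pred_bit(preds, pred2) ^ (invert2 & 1)
    return (a | b) == 1


def combined_and(preds, pred1, invert1, pred2, invert2):
    # Same selection and inversion, combined with a bitwise and instead.
    a = pred_bit(preds, pred1) ^ (invert1 & 1)
    b = pred_bit(preds, pred2) ^ (invert2 & 1)
    return (a & b) == 1
```

For example, with only the Zero predicate set, combined_or(0b10, Pred.ZERO, 0, Pred.OVERFLOW, 0) is true, while combined_and with the same arguments is false until invert2 is set to 1.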


j.sub.if.preds.combined.and 


Operands: pred1, invert1, pred2, invert2, addr

Description:
Branch to address addr if the bitwise and of the 2 predicates pred1 and pred2, inverting either bit with invert1 and invert2, is one, saving the return address.

Constraints:
• pred1 has domain immediate.
• pred1 has type integer.
• pred1 has width of 4.
• invert1 has domain immediate.
• invert1 has type integer.
• invert1 has width of 4.
• pred2 has domain immediate.
• pred2 has type integer.
• pred2 has width of 4.
• invert2 has domain immediate.
• invert2 has type integer.
• invert2 has width of 4.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Notes: The immediate values for the various predicate bits are as follows:
0: Carry or FPU inexact
1: Zero
2: MSB or FPU underflow
3: Overflow
4: Negative
5: True, always set
6-15: User defined

Details:
pred1  invert1  pred2  invert2  addr   cycles  latency  temporaries
i4us   i4us     i4us   i4us     m4us   2
i4us   i4us     i4us   i4us     il4us  2


j.sub.if.cry 


Operands: addr

Description:
Branch to address addr if carry flag is set to 1, saving the return address.

Constraints:
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Notes: The address saved can be branched to with the return instruction or read into registers with the return.get instruction.

Details:
addr   cycles  latency  temporaries | comments
m4us   2       2                    | hardware
il4us  2       2                    | hardware


j.sub.if.ncry 


Operands: addr

Description:
Branch to address addr if carry flag is set to 0, saving the return address.

Constraints:
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Notes: The address saved can be branched to with the return instruction or read into registers with the return.get instruction.

Details:
addr   cycles  latency  temporaries | comments
m4us   2       2                    | hardware
il4us  2       2                    | hardware


j.sub.if.zero 


Operands: addr

Description:
Branch to address addr if zero flag is set to 1, saving the return address.

Constraints:
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Notes: The address saved can be branched to with the return instruction or read into registers with the return.get instruction.

Details:
addr   cycles  latency  temporaries | comments
m4us   2       2                    | hardware
il4us  2       2                    | hardware


j.sub.if.nzero 


Operands: addr

Description:
Branch to address addr if zero flag is set to 0, saving the return address.

Constraints:
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Notes: The address saved can be branched to with the return instruction or read into registers with the return.get instruction.

Details:
addr   cycles  latency  temporaries | comments
m4us   2       2                    | hardware
il4us  2       2                    | hardware


j.sub.if.neg 


Operands: addr

Description:
Branch to address addr if negative flag is set to 1, saving the return address.

Constraints:
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Notes: The address saved can be branched to with the return instruction or read into registers with the return.get instruction.

Details:
addr   cycles  latency  temporaries | comments
m4us   2       2                    | hardware
il4us  2       2                    | hardware


j.sub.if.nneg 


Operands: addr

Description:
Branch to address addr if negative flag is set to 0, saving the return address.

Constraints:
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Notes: The address saved can be branched to with the return instruction or read into registers with the return.get instruction.

Details:
addr   cycles  latency  temporaries | comments
m4us   2       2                    | hardware
il4us  2       2                    | hardware


j.sub.if.msb 


Operands: addr

Description:
Branch to address addr if msb flag is set to 1, saving the return address.

Constraints:
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Notes: The address saved can be branched to with the return instruction or read into registers with the return.get instruction.

Details:
addr   cycles  latency  temporaries | comments
m4us   2       2                    | hardware
il4us  2       2                    | hardware


j.sub.if.nmsb 


Operands: addr

Description:
Branch to address addr if msb flag is set to 0, saving the return address.

Constraints:
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Notes: The address saved can be branched to with the return instruction or read into registers with the return.get instruction.

Details:
addr   cycles  latency  temporaries | comments
m4us                                | hardware
il4us                               | hardware


j.sub.if.ovf 


Operands: addr

Description:
Branch to address addr if overflow flag is set to 1, saving the return address.

Constraints:
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Notes: The address saved can be branched to with the return instruction or read into registers with the return.get instruction.

Details:
addr   cycles  latency  temporaries | comments
m4us   2       2                    | hardware
il4us  2       2                    | hardware


j.sub.if.novf


Operands: addr

Description:
Branch to address addr if overflow flag is set to 0, saving the return address.

Constraints:
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Notes: The address saved can be branched to with the return instruction or read into registers with the return.get instruction.

Details:
addr   cycles  latency  temporaries | comments
m4us   2       2                    | hardware
il4us  2       2                    | hardware


j.sub.if.lt (float (32 bit))

Operands: src0, src1, addr

Description:
Branch to address addr if src0 < src1, saving the return address.

Constraints:
• src0 has domain mono.
• src0 has type float.
• src0 has width of 4.
• src1 has domain mono or immediate or label.
• src1 has type float.
• src1 has width of 4.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Side Effects:
• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The address saved can be branched to with the return instruction or read into registers with the return.get instruction. The second operand, as stated in the documentation, can be an immediate or label and has the same cycle timing as if it were a mono register.

Details:
src0  src1  addr   cycles  latency  temporaries | comments
m4f   m4f   m4us                                | macro
m4f   m4f   il4us                               | macro


j.sub.if.lt (float (64 bit))

Operands: src0, src1, addr

Description:
Branch to address addr if src0 < src1, saving the return address.

Constraints:
• src0 has domain mono.
• src0 has type float.
• src0 has width of 8.
• src1 has domain mono or immediate or label.
• src1 has type float.
• src1 has width of 8.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Side Effects:
• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The address saved can be branched to with the return instruction or read into registers with the return.get instruction. The second operand, as stated in the documentation, can be an immediate or label and has the same cycle timing as if it were a mono register.

Details:
src0  src1  addr   cycles  latency  temporaries | comments
m8f   m8f   m4us                                | macro
m8f   m8f   il4us                               | macro


j.sub.if.lt (unsigned)

Operands: src0, src1, addr

Description:
Branch to address addr if src0 < src1, saving the return address.

Constraints:
• src0 has domain mono.
• src0 has type unsigned.
• src1 has domain mono or immediate or label.
• src1 has type unsigned.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Side Effects:
• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The address saved can be branched to with the return instruction or read into registers with the return.get instruction. The second operand, as stated in the documentation, can be an immediate or label and has the same cycle timing as if it were a mono register.

Details:
src0  src1  addr   cycles  latency  temporaries | comments
m2u   m2u   m4us   3       3                    | macro
m2u   m2u   il4us  3       3                    | macro
m4u   m4u   m4us   4       4                    | macro
m4u   m4u   il4us  4       4                    | macro


j.sub.if.lt (signed)


Operands: src0, src1, addr

Description:
Branch to address addr if src0 < src1, saving the return address.

Constraints:
• src0 has domain mono.
• src0 has type signed.
• src1 has domain mono or immediate or label.
• src1 has type signed.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Notes: The status is updated as if src1 had been subtracted from src0. The address saved can be branched to with the return instruction or read into registers with the return.get instruction. The second operand, as stated in the documentation, can be an immediate or label and has the same cycle timing as if it were a mono register.

Details:
src0  src1  addr   cycles  latency  temporaries | comments
m2s   m2s   m4us   3       3                    | macro
m2s   m2s   il4us  3       3                    | macro
m4s   m4s   m4us   4       4                    | macro
m4s   m4s   il4us  4       4                    | macro


j.sub.if.le (float (32 bit))

Operands: src0, src1, addr

Description:
Branch to address addr if src0 <= src1, saving the return address.

Constraints:
• src0 has domain mono.
• src0 has type float.
• src0 has width of 4.
• src1 has domain mono or immediate or label.
• src1 has type float.
• src1 has width of 4.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Side Effects:
• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The address saved can be branched to with the return instruction or read into registers with the return.get instruction. The second operand, as stated in the documentation, can be an immediate or label and has the same cycle timing as if it were a mono register.

Details:
src0  src1  addr   cycles  latency  temporaries | comments
m4f   m4f   m4us                                | macro
m4f   m4f   il4us                               | macro

Document No. 06-RM-1137 Revision: 3.A

ClearSpeed Technology plc


j.sub.if.le (float (64 bit))

Operands: src0, src1, addr

Description:
Branch to address addr if src0 <= src1, saving the return address.

Constraints:
• src0 has domain mono.
• src0 has type float.
• src0 has width of 8.
• src1 has domain mono or immediate or label.
• src1 has type float.
• src1 has width of 8.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Side Effects:
• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The address saved can be branched to with the return instruction or read into registers with the return.get instruction. The second operand, as stated in the documentation, can be an immediate or label and has the same cycle timing as if it were a mono register.

Details:
src0  src1  addr   cycles  latency  temporaries | comments
m8f   m8f   m4us                                | macro
m8f   m8f   il4us                               | macro


j.sub.if.le (unsigned) 


Operands: src0, src1, addr

Description:
Branch to address addr if src0 <= src1, saving the return address.

Constraints:
• src0 has domain mono.
• src0 has type unsigned.
• src1 has domain mono or immediate or label.
• src1 has type unsigned.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Notes: The status is updated as if src1 had been subtracted from src0. The address saved can be branched to with the return instruction or read into registers with the return.get instruction. The second operand, as stated in the documentation, can be an immediate or label and has the same cycle timing as if it were a mono register.

Details:
src0  src1  addr   cycles  latency  temporaries | comments
m2u   m2u   m4us   3       3                    | macro
m2u   m2u   il4us  3       3                    | macro
m4u   m4u   m4us   4       4                    | macro
m4u   m4u   il4us  4       4                    | macro
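
The difference between the (unsigned) and (signed) forms of a comparison can be illustrated with a short Python sketch. It models only the branch decision, not the status flags or timing. The function names are hypothetical, and it assumes that signed operands use two's complement, which this section does not state.

```python
def as_unsigned(value, width):
    """Interpret the low `width` bytes of value as an unsigned integer."""
    return value & ((1 << (8 * width)) - 1)


def as_signed(value, width):
    """Interpret the low `width` bytes of value as a signed integer.

    ASSUMPTION: signed operands are two's complement.
    """
    bits = 8 * width
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def branch_taken_le(src0, src1, width, signed):
    """Return True if a 'less than or equal' branch would be taken."""
    conv = as_signed if signed else as_unsigned
    return conv(src0, width) <= conv(src1, width)
```

With 2-byte operands, the bit pattern 0xFFFF is 65535 when unsigned and -1 when signed under this assumption, so branch_taken_le(0xFFFF, 1, 2, signed=False) is false while branch_taken_le(0xFFFF, 1, 2, signed=True) is true.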




j.sub.if.le (signed) 


Operands: src0, src1, addr

Description:
Branch to address addr if src0 <= src1, saving the return address.

Constraints:
• src0 has domain mono.
• src0 has type signed.
• src1 has domain mono or immediate or label.
• src1 has type signed.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Notes: The status is updated as if src1 had been subtracted from src0. The address saved can be branched to with the return instruction or read into registers with the return.get instruction. The second operand, as stated in the documentation, can be an immediate or label and has the same cycle timing as if it were a mono register.

Details:
src0  src1  addr   cycles  latency  temporaries | comments
m2s   m2s   m4us   3       3                    | macro
m2s   m2s   il4us  3       3                    | macro
m4s   m4s   m4us   4       4                    | macro
m4s   m4s   il4us  4       4                    | macro


j.sub.if.eq (float (32 bit))

Operands: src0, src1, addr

Description:
Branch to address addr if src0 == src1, saving the return address.

Constraints:
• src0 has domain mono.
• src0 has type float.
• src0 has width of 4.
• src1 has domain mono or immediate or label.
• src1 has type float.
• src1 has width of 4.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Side Effects:
• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The address saved can be branched to with the return instruction or read into registers with the return.get instruction. The second operand, as stated in the documentation, can be an immediate or label and has the same cycle timing as if it were a mono register.

Details:
src0  src1  addr   cycles  latency  temporaries | comments
m4f   m4f   m4us                                | macro
m4f   m4f   il4us                               | macro


j.sub.if.eq (float (64 bit))

Operands: src0, src1, addr

Description:
Branch to address addr if src0 == src1, saving the return address.

Constraints:
• src0 has domain mono.
• src0 has type float.
• src0 has width of 8.
• src1 has domain mono or immediate or label.
• src1 has type float.
• src1 has width of 8.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Side Effects:
• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The address saved can be branched to with the return instruction or read into registers with the return.get instruction. The second operand, as stated in the documentation, can be an immediate or label and has the same cycle timing as if it were a mono register.

Details:
src0  src1  addr   cycles  latency  temporaries | comments
m8f   m8f   m4us                                | macro
m8f   m8f   il4us                               | macro


j.sub.if.eq 


Operands: src0, src1, addr

Description:
Branch to address addr if src0 == src1, saving the return address.

Constraints:
• src0 has domain mono.
• src0 has type integer.
• src1 has domain mono or immediate or label.
• src1 has type integer.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Notes: The status is updated as if src1 had been subtracted from src0. The address saved can be branched to with the return instruction or read into registers with the return.get instruction. The second operand, as stated in the documentation, can be an immediate or label and has the same cycle timing as if it were a mono register.

Details:
src0  src1  addr   cycles  latency  temporaries | comments
m2us  m2us  m4us   3       3                    | macro
m2us  m2us  il4us  3       3                    | macro
m4us  m4us  m4us   4       4                    | macro
m4us  m4us  il4us  4       4                    | macro


j.sub.if.ne (float (32 bit))

Operands: src0, src1, addr

Description:
Branch to address addr if src0 != src1, saving the return address.

Constraints:
• src0 has domain mono.
• src0 has type float.
• src0 has width of 4.
• src1 has domain mono or immediate or label.
• src1 has type float.
• src1 has width of 4.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Side Effects:
• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The address saved can be branched to with the return instruction or read into registers with the return.get instruction. The second operand, as stated in the documentation, can be an immediate or label and has the same cycle timing as if it were a mono register.

Details:
src0  src1  addr   cycles  latency  temporaries | comments
m4f   m4f   m4us                                | macro
m4f   m4f   il4us                               | macro


j.sub.if.ne (float (64 bit))

Operands: src0, src1, addr

Description:
Branch to address addr if src0 != src1, saving the return address.

Constraints:
• src0 has domain mono.
• src0 has type float.
• src0 has width of 8.
• src1 has domain mono or immediate or label.
• src1 has type float.
• src1 has width of 8.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Side Effects:
• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The address saved can be branched to with the return instruction or read into registers with the return.get instruction. The second operand, as stated in the documentation, can be an immediate or label and has the same cycle timing as if it were a mono register.

Details:
src0  src1  addr   cycles  latency  temporaries | comments
m8f   m8f   m4us                                | macro
m8f   m8f   il4us                               | macro


j.sub.if.ne 


Operands: src0, src1, addr

Description:
Branch to address addr if src0 != src1, saving the return address.

Constraints:
• src0 has domain mono.
• src0 has type integer.
• src1 has domain mono or immediate or label.
• src1 has type integer.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Notes: The status is updated as if src1 had been subtracted from src0. The address saved can be branched to with the return instruction or read into registers with the return.get instruction. The second operand, as stated in the documentation, can be an immediate or label and has the same cycle timing as if it were a mono register.

Details:
src0  src1  addr   cycles  latency  temporaries | comments
m2us  m2us  m4us   3       3                    | macro
m2us  m2us  il4us  3       3                    | macro
m4us  m4us  m4us   4       4                    | macro
m4us  m4us  il4us  4       4                    | macro


j.sub.if.gt (float (32 bit))

Operands: src0, src1, addr

Description:
Branch to address addr if src0 > src1, saving the return address.

Constraints:
• src0 has domain mono.
• src0 has type float.
• src0 has width of 4.
• src1 has domain mono or immediate or label.
• src1 has type float.
• src1 has width of 4.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Side Effects:
• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The address saved can be branched to with the return instruction or read into registers with the return.get instruction. The second operand, as stated in the documentation, can be an immediate or label and has the same cycle timing as if it were a mono register.

Details:
src0  src1  addr   cycles  latency  temporaries | comments
m4f   m4f   m4us                                | macro
m4f   m4f   il4us                               | macro


j.sub.if.gt (float (64 bit))

Operands: src0, src1, addr

Description:
Branch to address addr if src0 > src1, saving the return address.

Constraints:
• src0 has domain mono.
• src0 has type float.
• src0 has width of 8.
• src1 has domain mono or immediate or label.
• src1 has type float.
• src1 has width of 8.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Side Effects:
• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The address saved can be branched to with the return instruction or read into registers with the return.get instruction. The second operand, as stated in the documentation, can be an immediate or label and has the same cycle timing as if it were a mono register.

Details:
src0  src1  addr   cycles  latency  temporaries | comments
m8f   m8f   m4us                                | macro
m8f   m8f   il4us                               | macro


j.sub.if.gt (unsigned) 


Operands: src0, src1, addr

Description:
Branch to address addr if src0 > src1, saving the return address.

Constraints:
• src0 has domain mono.
• src0 has type unsigned.
• src1 has domain mono or immediate or label.
• src1 has type unsigned.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Notes: The status is updated as if src1 had been subtracted from src0. The address saved can be branched to with the return instruction or read into registers with the return.get instruction. The second operand, as stated in the documentation, can be an immediate or label and has the same cycle timing as if it were a mono register.

Details:
src0  src1  addr   cycles  latency  temporaries | comments
m2u   m2u   m4us   3       3                    | macro
m2u   m2u   il4us  3       3                    | macro
m4u   m4u   m4us   4       4                    | macro
m4u   m4u   il4us  4       4                    | macro


j.sub.if.gt (signed) 


Operands: src0, src1, addr

Description:
Branch to address addr if src0 > src1, saving the return address.

Constraints:
• src0 has domain mono.
• src0 has type signed.
• src1 has domain mono or immediate or label.
• src1 has type signed.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Notes: The status is updated as if src1 had been subtracted from src0. The address saved can be branched to with the return instruction or read into registers with the return.get instruction. The second operand, as stated in the documentation, can be an immediate or label and has the same cycle timing as if it were a mono register.

Details:
src0  src1  addr   cycles  latency  temporaries | comments
m2s   m2s   m4us   3       3                    | macro
m2s   m2s   il4us  3       3                    | macro
m4s   m4s   m4us   4       4                    | macro
m4s   m4s   il4us  4       4                    | macro


j.sub.if.ge (float (32 bit))

Operands: src0, src1, addr

Description:
Branch to address addr if src0 >= src1, saving the return address.

Constraints:
• src0 has domain mono.
• src0 has type float.
• src0 has width of 4.
• src1 has domain mono or immediate or label.
• src1 has type float.
• src1 has width of 4.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Side Effects:
• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The address saved can be branched to with the return instruction or read into registers with the return.get instruction. The second operand, as stated in the documentation, can be an immediate or label and has the same cycle timing as if it were a mono register.

Details:
src0  src1  addr   cycles  latency  temporaries | comments
m4f   m4f   m4us                                | macro
m4f   m4f   il4us                               | macro


j.sub.if.ge (float (64 bit))

Operands: src0, src1, addr

Description:
Branch to address addr if src0 >= src1, saving the return address.

Constraints:
• src0 has domain mono.
• src0 has type float.
• src0 has width of 8.
• src1 has domain mono or immediate or label.
• src1 has type float.
• src1 has width of 8.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Side Effects:
• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The address saved can be branched to with the return instruction or read into registers with the return.get instruction. The second operand, as stated in the documentation, can be an immediate or label and has the same cycle timing as if it were a mono register.

Details:
src0  src1  addr   cycles  latency  temporaries | comments
m8f   m8f   m4us                                | macro
m8f   m8f   il4us                               | macro


j.sub.if.ge (unsigned) 


Operands: src0, src1, addr

Description:
Branch to address addr if src0 >= src1, saving the return address.

Constraints:
• src0 has domain mono.
• src0 has type unsigned.
• src1 has domain mono or immediate or label.
• src1 has type unsigned.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Notes: The status is updated as if src1 had been subtracted from src0. The address saved can be branched to with the return instruction or read into registers with the return.get instruction. The second operand, as stated in the documentation, can be an immediate or label and has the same cycle timing as if it were a mono register.

Details:
src0  src1  addr   cycles  latency  temporaries | comments
m2u   m2u   m4us   3       3                    | macro
m2u   m2u   il4us  3       3                    | macro
m4u   m4u   m4us   4       4                    | macro
m4u   m4u   il4us  4       4                    | macro


j.sub.if.ge (signed) 


Operands: src0, src1, addr

Description:

Branch to address addr if src0 >= src1, saving the return address.

Constraints:

• src0 has domain mono.
• src0 has type signed.
• src1 has domain mono or immediate or label.
• src1 has type signed.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Notes: The status is updated as if src1 had been subtracted from src0. The address saved
can be branched to with the return instruction or read into registers with the return.get
instruction. The second operand, as stated in the documentation, can be an immediate or
label and has the same cycle timing as if it were a mono register.


Details: 

src0 src1 addr cycles latency temporaries | comments
m2s m2s m4us 3 3 macro
m2s m2s il4us 3 3 macro
m4s m4s m4us 4 4 macro
m4s m4s il4us 4 4 macro
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CSX600 Instruction Set Instruction set description 


j.sub.if 


Operands: cond, addr

Description:

Branch to address addr if cond is non-zero, saving the return address.

Constraints:

• cond has type integer.
• cond has domain mono.
• addr has domain mono or immediate or label.
• addr has width of 4.

Side Effects:

• Leaves the status register in an undefined state.

Notes: The address saved can be branched to with the return instruction or read into
registers with the return.get instruction.

Details:
cond addr cycles latency temporaries | comments 
m2us m4u 3 3 macro 
m2us il4u 3 3 macro 
m4us m4u 4 4 macro 
m4us il4u 4 4 macro
m2us m4s 3 3 macro 
m2us il4s 3 3 macro 
m4us m4s 4 4 macro 
m4us il4s 4 4 macro 
m4us m4f 4 4 macro 
m4us il4f 4 4 macro 
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Instruction set description CSX600 Instruction Set 


j.sub.ifn 




Operands: cond, addr

Description:

Branch to address addr if cond is zero, saving the return address.

Constraints:

• cond has type integer.
• cond has domain mono.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Side Effects:

• Leaves the status register in an undefined state.

Notes: The address saved can be branched to with the return instruction or read into
registers with the return.get instruction.

Details: 

cond addr cycles latency temporaries | comments 
m2us m4us 3 3 macro 
m2us il4us 3 3 macro 
m4us m4us 4 4 macro 
m4us il4us 4 4 macro 
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CSX600 Instruction Set Instruction set description 


return 


Operands: 


Description: 


Branch to address saved. 

Notes: The address saved is set by an instruction prefixed with j.sub or by a return.put
instruction.
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Instruction set description CSX600 Instruction Set 


return.get 




Operands: dst 


Description: 


Sets dst to be the return save address.

Constraints:

• dst has width of 4.
• dst has domain mono.
• dst has type unsigned.

Notes: This allows the current return address to be saved before further j.sub style
instructions, which will overwrite the return address. The address saved can then be
branched to directly. This uses the instructions return.get.low and return.get.high.


Details: 
dst cycles latency temporaries | comments 
m4u 2 2 macro 
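
The reason for saving the return address before a nested j.sub can be illustrated with a small Python model. This is a sketch under stated assumptions, not a description of the hardware: the class ReturnSaveModel and its method names are hypothetical, and the model only captures the behaviour stated in these entries (a single saved address that j.sub style instructions overwrite, and that return.get and return.put read and write).

```python
class ReturnSaveModel:
    """Toy model of the single return save address (all names are hypothetical)."""

    MASK = 0xFFFFFFFF  # the return save address is 4 bytes wide per the constraints

    def __init__(self) -> None:
        self.return_save = 0

    def j_sub(self, return_address: int) -> None:
        """Model a j.sub style branch: the saved address is overwritten."""
        self.return_save = return_address & self.MASK

    def return_get(self) -> int:
        """Model return.get: copy the saved address into a register."""
        return self.return_save

    def return_put(self, src: int) -> None:
        """Model return.put: set the saved address from a register."""
        self.return_save = src & self.MASK

    def do_return(self) -> int:
        """Model return: the branch target is the saved address."""
        return self.return_save


def nested_call_demo():
    """Return the (inner, outer) return targets for one nested call."""
    model = ReturnSaveModel()
    model.j_sub(0x100)               # outer call saves 0x100
    saved = model.return_get()       # preserve it before the nested call
    model.j_sub(0x200)               # nested call overwrites the saved address
    inner_target = model.do_return()
    model.return_put(saved)          # restore the outer return address
    outer_target = model.do_return()
    return inner_target, outer_target
```

Without the return_get and return_put pair, the outer return in this model would branch to the inner call's address, which is the overwrite the notes warn about.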
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CSX600 Instruction Set Instruction set description 


return.get.low 


Operands: dst 


Description: 


Sets dst to be the low half of the return save address.

Constraints:

• dst has width of 2.
• dst has domain mono.
• dst has type unsigned.

Notes: This allows the current return address to be saved before further j.sub style
instructions, which will overwrite the return address. The address saved can then be
branched to directly.


Details: 
dst cycles latency temporaries | comments 
m2u 1 1 hardware 
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Instruction set description 




return.get.high 




Operands: dst 


Description: 


Sets dst to be the high half of the return save address. 


Constraints:

• dst has width of 2.
• dst has domain mono.
• dst has type unsigned.

Notes: This allows the current return address to be saved before further j.sub style
instructions, which will overwrite the return address. The address saved can then be
branched to directly.


Details: 
dst cycles latency temporaries | comments 
m2u 1 1 hardware 
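
The relationship between the 4-byte return save address and its two 2-byte halves can be sketched as follows. The function names are hypothetical, and the sketch assumes that the "low half" is the least significant 16 bits and the "high half" the most significant 16 bits; the source does not spell this out.

```python
def split_address(addr: int):
    """Split a 4-byte address into (low, high) 2-byte halves (assumed bit layout)."""
    low = addr & 0xFFFF            # assumed: low half = bits 0-15
    high = (addr >> 16) & 0xFFFF   # assumed: high half = bits 16-31
    return low, high


def join_address(low: int, high: int) -> int:
    """Rebuild the 4-byte address from its two 2-byte halves."""
    return ((high & 0xFFFF) << 16) | (low & 0xFFFF)
```

Under these assumptions, splitting and then joining is lossless for any 4-byte value, which is consistent with return.get being a macro built from return.get.low and return.get.high.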
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CSX600 Instruction Set Instruction set description 


return.put 


Operands: src 


Description: 


Sets the return save address to src.

Constraints:

• src has width of 4.
• src has domain mono.
• src has type unsigned.

Notes: Future return instructions will use this address.


Details: 
src cycles latency temporaries | comments 
m4u 2 2 macro 
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Instruction set description 




return.put.low 


Operands: dst 


Description: 


Sets the low half of the return save address to the value of dst.

Constraints:

• dst has width of 2.
• dst has domain mono.
• dst has type unsigned.

Notes: Future return instructions will use this address.


Details: 
dst cycles latency temporaries | comments 
m2u 1 1 hardware 
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CSX600 Instruction Set Instruction set description 


return.put.high 


Operands: dst 


Description: 


Sets the high half of the return save address to the value of dst.

Constraints:

• dst has width of 2.
• dst has domain mono.
• dst has type unsigned.

Notes: Future return instructions will use this address.


Details: 
dst cycles latency temporaries | comments 
m2u 1 1 hardware 
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Instruction set description CSX600 Instruction Set 


icache.prefetch 


Operands: addr 


Description: 


Prefetch addr as if a branch to that address was taken.

Constraints:

• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Notes: This can be used to ensure that an address is in the instruction cache before a
branch is taken.


Details: 

addr cycles latency temporaries | comments 
m4us 2 2 hardware 
il4us 2 2 hardware 
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CSX600 Instruction Set 


Instruction set description 


set.varargs 


Operands: mono_offset, poly_offset, arg0

Description:

Creates a varargs stack and places operand arg0 on it.

Constraints:

• mono_offset has domain immediate.
• mono_offset has type integer.
• mono_offset has width of 4.
• poly_offset has type integer.
• poly_offset has domain immediate.
• poly_offset has width of 4.
• arg0 has domain mono or poly.

Notes: Further varargs can be appended using set.vararg and then called with the
call.vararg instruction. The mono_offset and poly_offset preserve the space on the stacks
that are currently being used for local variables.


Details: 

mono_offset poly_offset arg0 cycles latency temporaries | comments
i4us i4us m2 28 28 4 mono macro
i4us i4us m4 26 26 macro
i4us i4us p1 24 24 4 poly macro
i4us i4us p2 24 24 4 poly macro
i4us i4us p4 20 20 macro
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Instruction set description CSX600 Instruction Set 


set.varargs

Operands: mono_offset, poly_offset, arg0, arg1

Description:

Creates a varargs stack and places operands arg0 and arg1 on it.

Constraints:

• mono_offset has domain immediate.
• mono_offset has type integer.
• mono_offset has width of 4.
• poly_offset has type integer.
• poly_offset has domain immediate.
• poly_offset has width of 4.
• arg0 has domain mono or poly.
• arg1 has domain mono or poly.

Notes: Further varargs can be appended using set.vararg and then called with the
call.vararg instruction. The specific instructions shown below are only examples, as a mix
of widths and types of the arguments is allowed. The mono_offset and poly_offset preserve
the space on the stacks that are currently being used for local variables.
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CSX600 Instruction Set Instruction set description 


set.varargs 


Operands: mono_offset, poly_offset, arg0, arg1, arg2

Description:

Creates a varargs stack and places operands arg0, arg1 and arg2 on it.

Constraints:

• mono_offset has domain immediate.
• mono_offset has type integer.
• mono_offset has width of 4.
• poly_offset has type integer.
• poly_offset has domain immediate.
• poly_offset has width of 4.
• arg0 has domain mono or poly.
• arg1 has domain mono or poly.
• arg2 has domain mono or poly.

Notes: Further varargs can be appended using set.vararg and then called with the
call.vararg instruction. The specific instructions shown below are only examples, as a mix
of widths and types of the arguments is allowed. The mono_offset and poly_offset preserve
the space on the stacks that are currently being used for local variables.


set.varargs

Operands: mono_offset, poly_offset, arg0, arg1, arg2, arg3

Description:

Creates a varargs stack and places operands arg0, arg1, arg2 and arg3 on it.

Constraints:

• mono_offset has domain immediate.
• mono_offset has type integer.
• mono_offset has width of 4.
• poly_offset has type integer.
• poly_offset has domain immediate.
• poly_offset has width of 4.
• arg0 has domain mono or poly.
• arg1 has domain mono or poly.
• arg2 has domain mono or poly.
• arg3 has domain mono or poly.

Notes: Further varargs can be appended using set.vararg and then called with the
call.vararg instruction. The specific instructions shown below are only examples, as a mix
of widths and types of the arguments is allowed. The mono_offset and poly_offset preserve
the space on the stacks that are currently being used for local variables.
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set.varargs 


Operands: mono_offset, poly_offset, arg0, arg1, arg2, arg3, arg4

Description:

Creates a varargs stack and places operands arg0, arg1, arg2, arg3 and arg4 on it.

Constraints:

• mono_offset has domain immediate.
• mono_offset has type integer.
• mono_offset has width of 4.
• poly_offset has type integer.
• poly_offset has domain immediate.
• poly_offset has width of 4.
• arg0 has domain mono or poly.
• arg1 has domain mono or poly.
• arg2 has domain mono or poly.
• arg3 has domain mono or poly.
• arg4 has domain mono or poly.

Notes: Further varargs can be appended using set.vararg and then called with the
call.vararg instruction. The specific instructions shown below are only examples, as a mix
of widths and types of the arguments is allowed. The mono_offset and poly_offset preserve
the space on the stacks that are currently being used for local variables.
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CSX600 Instruction Set 


set.varargs

Operands: arg0

Description:

Creates a varargs stack and places operand arg0 on it.

Constraints:

• arg0 has domain mono or poly.

Notes: Further varargs can be appended using set.vararg and then called with the
call.vararg instruction.

Details: 

arg0 cycles latency temporaries | comments
m2 28 28 4 mono macro
m4 26 26 macro
p1 24 24 4 poly macro
p2 24 24 4 poly macro
p4 20 20 macro
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CSX600 Instruction Set Instruction set description 


set.varargs 


Operands: arg0, arg1

Description:

Creates a varargs stack and places operands arg0 and arg1 on it.

Constraints:

• arg0 has domain mono or poly.
• arg1 has domain mono or poly.

Notes: Further varargs can be appended using set.vararg and then called with the
call.vararg instruction. The specific instructions shown below are only examples, as a mix
of widths and types of the arguments is allowed.
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Instruction set description CSX600 Instruction Set 


set.varargs 


Operands: arg0, arg1, arg2

Description:

Creates a varargs stack and places operands arg0, arg1 and arg2 on it.

Constraints:

• arg0 has domain mono or poly.
• arg1 has domain mono or poly.
• arg2 has domain mono or poly.

Notes: Further varargs can be appended using set.vararg and then called with the
call.vararg instruction. The specific instructions shown below are only examples, as a mix
of widths and types of the arguments is allowed.


310 Document No. 06-RM-1137 Revision: 3.A 
ClearSpeed Technology plc 




set.varargs 


Operands: arg0, arg1, arg2, arg3

Description:

Creates a varargs stack and places operands arg0, arg1, arg2 and arg3 on it.

Constraints:

• arg0 has domain mono or poly.
• arg1 has domain mono or poly.
• arg2 has domain mono or poly.
• arg3 has domain mono or poly.

Notes: Further varargs can be appended using set.vararg and then called with the
call.vararg instruction. The specific instructions shown below are only examples, as a mix
of widths and types of the arguments is allowed.




set.varargs

Operands: arg0, arg1, arg2, arg3, arg4

Description:

Creates a varargs stack and places operands arg0, arg1, arg2, arg3 and arg4 on it.

Constraints:

• arg0 has domain mono or poly.
• arg1 has domain mono or poly.
• arg2 has domain mono or poly.
• arg3 has domain mono or poly.
• arg4 has domain mono or poly.

Notes: Further varargs can be appended using set.vararg and then called with the
call.vararg instruction. The specific instructions shown below are only examples, as a mix
of widths and types of the arguments is allowed.
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CSX600 Instruction Set Instruction set description 


set.vararg 


Operands: arg 


Description: 


Appends arg to the vararg stack.

Constraints:

• arg has domain mono or poly.

Notes: Further varargs can be appended using set.vararg and then called with the
call.vararg instruction.

Details:

arg cycles latency temporaries | comments
m2 28 28 4 mono macro
m4 26 26 macro
p1 24 24 4 poly macro
p2 24 24 4 poly macro
p4 20 20 macro
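
The order in which the varargs instructions are meant to be combined (create the stack, append further arguments, then make the call) can be sketched as a small Python model. This is a hedged illustration rather than the actual stack layout: the class VarargsModel and its methods are hypothetical, the limit of five initial arguments mirrors the arg0 to arg4 forms documented above, and the assumption that arguments reach the callee in the order they were placed is not stated in the source.

```python
class VarargsModel:
    """Toy model of the varargs stack (names and ordering are assumptions)."""

    def __init__(self) -> None:
        self.stack = None  # no varargs stack exists until set_varargs is used

    def set_varargs(self, *args) -> None:
        """Model set.varargs: create the stack and place arg0..arg4 on it."""
        if not 1 <= len(args) <= 5:
            raise ValueError("the documented forms take between 1 and 5 arguments")
        self.stack = list(args)

    def set_vararg(self, arg) -> None:
        """Model set.vararg: append one further argument to the existing stack."""
        if self.stack is None:
            raise RuntimeError("set_varargs must create the stack first")
        self.stack.append(arg)

    def call_varargs(self, func):
        """Model the call: hand the collected arguments to the callee."""
        args = self.stack or []
        self.stack = None  # assumed: the stack is consumed by the call
        return func(*args)
```

A typical sequence in this model is set_varargs(a, b), then set_vararg(c), then call_varargs(target), after which target receives (a, b, c).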
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Instruction set description CSX600 Instruction Set 


set.fixedarg

Operands: arg

Description:

Appends arg to the normal fixed args.

Constraints:

• arg has domain mono or poly.

Notes: This is used when additional fixed args are needed when using the call.vararg
instruction. The initial fixed arguments should go in registers, as per the ABI. These
parameters will appear on the stack before any in the call.vararg instruction.
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CSX600 Instruction Set Instruction set description 


call.varargs 


Operands: label 


Description: 


Make the call to the variable argument function.

Constraints:

• label has domain mono or immediate or label.
• label has type integer.
• label has width of 4.

Details:

label cycles latency temporaries | comments
m4us 26 26 4 mono macro
il4us 26 26 4 mono macro
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call.varargs 
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Operands: label, arg0 


Description: 


Make the call to the variable argument function.

Constraints:

• label has domain mono or immediate or label.
• label has type integer.
• label has width of 4.
• arg0 has domain mono or poly.

Notes: The arguments are appended to the fixed argument stack and the function is then
called. These fixed arguments are ones that do not fit into the parameter passing registers.
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CSX600 Instruction Set Instruction set description 


call.varargs 


Operands: label, arg0, arg1

Description:

Creates a varargs stack and places operands arg0 and arg1 on it.

Constraints:

• label has domain mono or immediate or label.
• label has type integer.
• label has width of 4.
• arg0 has domain mono or poly.
• arg1 has domain mono or poly.

Notes: The arguments are appended to the fixed argument stack and the function is then
called. These fixed arguments are ones that do not fit into the parameter passing registers.
The specific instructions shown below are only examples, as a mix of widths and types of
the arguments is allowed.




call.varargs

Operands: label, arg0, arg1, arg2

Description:

Creates a varargs stack and places operands arg0, arg1 and arg2 on it.

Constraints:

• label has domain mono or immediate or label.
• label has type integer.
• label has width of 4.
• arg0 has domain mono or poly.
• arg1 has domain mono or poly.
• arg2 has domain mono or poly.

Notes: The arguments are appended to the fixed argument stack and the function is then
called. These fixed arguments are ones that do not fit into the parameter passing registers.
The specific instructions shown below are only examples, as a mix of widths and types of
the arguments is allowed.
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CSX600 Instruction Set Instruction set description 


call.varargs 


Operands: label, arg0, arg1, arg2, arg3

Description:

Creates a varargs stack and places operands arg0, arg1, arg2 and arg3 on it.

Constraints:

• label has domain mono or immediate or label.
• label has type integer.
• label has width of 4.
• arg0 has domain mono or poly.
• arg1 has domain mono or poly.
• arg2 has domain mono or poly.
• arg3 has domain mono or poly.

Notes: The arguments are appended to the fixed argument stack and the function is then
called. These fixed arguments are ones that do not fit into the parameter passing registers.
The specific instructions shown below are only examples, as a mix of widths and types of
the arguments is allowed.
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call.varargs 




Operands: label, arg0, arg1, arg2, arg3, arg4

Description:

Creates a varargs stack and places operands arg0, arg1, arg2, arg3 and arg4 on it.

Constraints:

• label has domain mono or immediate or label.
• label has type integer.
• label has width of 4.
• arg0 has domain mono or poly.
• arg1 has domain mono or poly.
• arg2 has domain mono or poly.
• arg3 has domain mono or poly.
• arg4 has domain mono or poly.

Notes: The arguments are appended to the fixed argument stack and the function is then
called. These fixed arguments are ones that do not fit into the parameter passing registers.
The specific instructions shown below are only examples, as a mix of widths and types of
the arguments is allowed.
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jx 
Operands: addr 
Description: 
Relative branch to address addr. If the address specified is an
immediate, the value has to be the desired number of instructions to
jump, minus 2. For example, to jump to the next instruction, as if no
jump has taken place, the immediate value should be -1.

Constraints:

• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Details:
addr cycles latency temporaries | comments 
il4us 1 1 hardware 
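
The offset rule for immediate operands of the relative branches can be written out as a short Python sketch. The function names are hypothetical. The forward case follows the worked example in the description (one instruction ahead gives -1); applying the same rule to backward jumps is an assumption.

```python
def relative_immediate(instructions_to_jump: int) -> int:
    """Immediate to encode for a relative branch of the given instruction count."""
    return instructions_to_jump - 2


def instructions_jumped(immediate: int) -> int:
    """Inverse mapping: the instruction count that an immediate corresponds to."""
    return immediate + 2
```

In this model, relative_immediate(1) gives -1, matching the documented case of branching to the next instruction.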
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jr.if.pred 
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Operands: pred, addr 


Description: 


Relative branch to address addr if the predicate pred is set to 1.
If the address specified is an immediate, the value has to be the
desired number of instructions to jump, minus 2. For example, to jump
to the next instruction, as if no jump has taken place, the immediate
value should be -1.

Constraints:

• pred has domain immediate.
• pred has type integer.
• pred has width of 4.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Notes: The immediate values for the various predicate bits are as follows:

• 0: Carry or FPU inexact
• 1: Zero
• 2: MSB or FPU underflow
• 3: Overflow
• 4: Negative
• 5: True, always set
• 6-15: User defined


Details: 

pred addr cycles latency temporaries | comments 
i4us m4us 1 1 hardware 
i4us il4us 1 1 hardware 
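
The predicate numbering from the notes can be captured in a small Python lookup. This is a sketch for reference only: the enum member names and both helper functions are hypothetical, and packing predicate n into bit n of an integer is an assumption made for the model, not a statement about the status register layout.

```python
from enum import IntEnum


class Predicate(IntEnum):
    """Predicate indices as listed in the notes; member names are hypothetical."""
    CARRY_OR_FPU_INEXACT = 0
    ZERO = 1
    MSB_OR_FPU_UNDERFLOW = 2
    OVERFLOW = 3
    NEGATIVE = 4
    TRUE = 5


def predicate_name(index: int) -> str:
    """Return a readable name for predicate index 0-15."""
    if not 0 <= index <= 15:
        raise ValueError("predicate index must be in the range 0-15")
    if index >= 6:
        return f"USER_DEFINED_{index}"  # indices 6-15 are user defined
    return Predicate(index).name


def branch_taken(predicates: int, index: int, branch_if_set: bool = True) -> bool:
    """Model jr.if.pred (branch_if_set=True) or jr.if.npred (branch_if_set=False).

    Assumption: 'predicates' packs predicate n into bit n of an integer.
    """
    is_set = bool((predicates >> index) & 1)
    return is_set == branch_if_set
```

With only the Zero predicate set (bit 1), branch_taken(0b10, Predicate.ZERO) is True for the jr.if.pred form and False for the jr.if.npred form.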
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CSX600 Instruction Set Instruction set description 


jr.if.npred 


Operands: pred, addr 


Description: 


Relative branch to address addr if the predicate pred is set to 0.
If the address specified is an immediate, the value has to be the
desired number of instructions to jump, minus 2. For example, to jump
to the next instruction, as if no jump has taken place, the immediate
value should be -1.

Constraints:

• pred has domain immediate.
• pred has type integer.
• pred has width of 4.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Notes: The immediate values for the various predicate bits are as follows:

• 0: Carry or FPU inexact
• 1: Zero
• 2: MSB or FPU underflow
• 3: Overflow
• 4: Negative
• 5: True, always set
• 6-15: User defined

Details:

pred addr cycles latency temporaries | comments
i4us m4us 1 1 hardware
i4us il4us 1 1 hardware
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CSX600 Instruction Set 




jr.if.cry 


Operands: addr 


Description: 


Relative branch to address addr if carry flag is set to 1. If the
address specified is an immediate, the value has to be the desired
number of instructions to jump, minus 2. For example, to jump to the
next instruction, as if no jump has taken place, the immediate value
should be -1.

Constraints:

• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.


Details: 

addr cycles latency temporaries | comments 
m4us 1 1 hardware 
il4us 1 1 hardware 
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CSX600 Instruction Set Instruction set description 


jr.if.ncry 


Operands: addr 


Description: 


Relative branch to address addr if carry flag is set to 0. If the
address specified is an immediate, the value has to be the desired
number of instructions to jump, minus 2. For example, to jump to the
next instruction, as if no jump has taken place, the immediate value
should be -1.

Constraints:

• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.


Details: 

addr cycles latency temporaries | comments 
m4us 1 1 hardware 
il4us 1 1 hardware 
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jr.if.zero 


Operands: addr 


Description: 


Relative branch to address addr if zero flag is set to 1. If the
address specified is an immediate, the value has to be the desired
number of instructions to jump, minus 2. For example, to jump to the
next instruction, as if no jump has taken place, the immediate value
should be -1.

Constraints:

• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Details:

addr cycles latency temporaries | comments
m4us 1 1 hardware
il4us 1 1 hardware
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CSX600 Instruction Set Instruction set description 


jr.if.nzero 


Operands: addr 


Description: 


Relative branch to address addr if zero flag is set to 0. If the
address specified is an immediate, the value has to be the desired
number of instructions to jump, minus 2. For example, to jump to the
next instruction, as if no jump has taken place, the immediate value
should be -1.

Constraints:

• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.


Details: 

addr cycles latency temporaries | comments 
m4us 1 1 hardware 
il4us 1 1 hardware 
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jr.if.neg 


Operands: addr 


Description: 


Relative branch to address addr if neg flag is set to 1. If the
address specified is an immediate, the value has to be the desired
number of instructions to jump, minus 2. For example, to jump to the
next instruction, as if no jump has taken place, the immediate value
should be -1.

Constraints:

• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.


Details: 

addr cycles latency temporaries | comments 
m4us 1 1 hardware 
il4us 1 1 hardware 
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CSX600 Instruction Set 




jr.if.nneg 


Operands: addr 


Description: 


Relative branch to address addr if neg flag is set to 0. If the
address specified is an immediate, the value has to be the desired
number of instructions to jump, minus 2. For example, to jump to the
next instruction, as if no jump has taken place, the immediate value
should be -1.

Constraints:

• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.


Details: 

addr cycles latency temporaries | comments 
m4us 1 1 hardware 
il4us 1 1 hardware 
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jr.if.msb 


Operands: addr 


Description: 


Relative branch to address addr if msb flag is set to 1. If the
address specified is an immediate, the value has to be the desired
number of instructions to jump, minus 2. For example, to jump to the
next instruction, as if no jump has taken place, the immediate value
should be -1.

Constraints:

• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Details:

addr cycles latency temporaries | comments
m4us 1 1 hardware
il4us 1 1 hardware
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CSX600 Instruction Set Instruction set description 


jr.if.nmsb 


Operands: addr 


Description: 


Relative branch to address addr if msb flag is set to 0. If the
address specified is an immediate, the value has to be the desired
number of instructions to jump, minus 2. For example, to jump to the
next instruction, as if no jump has taken place, the immediate value
should be -1.

Constraints:

• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.


Details: 

addr cycles latency temporaries | comments 
m4us 1 1 hardware 
il4us 1 1 hardware 
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jr.if.ovf 


Operands: addr 


Description: 


Relative branch to address addr if overflow flag is set to 1. If the
address specified is an immediate, the value has to be the desired
number of instructions to jump, minus 2. For example, to jump to the
next instruction, as if no jump has taken place, the immediate value
should be -1.

Constraints:

• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.


Details: 

addr cycles latency temporaries | comments 
m4us 1 1 hardware 
il4us 1 1 hardware 
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CSX600 Instruction Set Instruction set description 


jr.if.novf 


Operands: addr 


Description: 


Relative branch to address addr if overflow flag is set to 0. If the
address specified is an immediate, the value has to be the desired
number of instructions to jump, minus 2. For example, to jump to the
next instruction, as if no jump has taken place, the immediate value
should be -1.

Constraints:

• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.


Details: 

addr cycles latency temporaries | comments 
m4us 1 1 hardware 
il4us 1 1 hardware 
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Operands: src0, srcl, addr 


Description: 


Relative branch to address addr if src0 < src1. If the address
specified is an immediate, the value has to be the desired number of
instructions to jump, minus 2. For example, to jump to the next
instruction, as if no jump has taken place, the immediate value should
be -1.

Constraints:

• src0 has domain mono.
• src0 has type float.
• src0 has width of 4.
• src1 has domain mono or immediate or label.
• src1 has type float.
• src1 has width of 4.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Side Effects:

• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand,
as stated in the documentation, can be an immediate or label and has the same cycle timing
as if it were a mono register.
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CSX600 Instruction Set Instruction set description 


jr.if.lt (float (64 bit) ) 


Operands: src0, src1, addr

Description:

Relative branch to address addr if src0 < src1. If the address
specified is an immediate, the value has to be the desired number of
instructions to jump, minus 2. For example, to jump to the next
instruction, as if no jump has taken place, the immediate value should
be -1.

Constraints:

• src0 has domain mono.
• src0 has type float.
• src0 has width of 8.
• src1 has domain mono or immediate or label.
• src1 has type float.
• src1 has width of 8.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Side Effects:

• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand,
as stated in the documentation, can be an immediate or label and has the same cycle timing
as if it were a mono register.
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Operands: src0, srcl, addr 


Description: 


Relative branch to address addr if src0 < src1. If the address
specified is an immediate, the value has to be the desired number of
instructions to jump, minus 2. For example, to jump to the next
instruction, as if no jump has taken place, the immediate value should
be -1.

Constraints:

• src0 has domain mono.
• src0 has type unsigned.
• src1 has domain mono or immediate or label.
• src1 has type unsigned.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Side Effects:

• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand,
as stated in the documentation, can be an immediate or label and has the same cycle timing
as if it were a mono register.




jr.if.lt (signed)

Operands: src0, src1, addr

Description:

Relative branch to address addr if src0 < src1. If the address
specified is an immediate, the value has to be the desired number of
instructions to jump, minus 2. For example, to jump to the next
instruction, as if no jump has taken place, the immediate value should
be -1.

Constraints:

• src0 has domain mono.
• src0 has type signed.
• src1 has domain mono or immediate or label.
• src1 has type signed.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Side Effects:

• Updates the status register.


Notes: The status is updated as if src1 had been subtracted from src0. The second operand 
as stated in the documentation can be an immediate or label and has the same cycle timing 
as if it was a mono register. 
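
The statement that the status is updated "as if src1 had been subtracted from src0" can be illustrated for integer operands with a minimal Python sketch. The function name and the returned dictionary are hypothetical, and only two results are modelled: whether the truncated difference is zero and what its most significant bit is. How the hardware derives the carry, overflow and negative flags is not described here, so they are deliberately left out.

```python
def compare_status(src0: int, src1: int, width_bytes: int) -> dict:
    """Model part of the status after comparing two integers of the given width."""
    bits = 8 * width_bytes
    mask = (1 << bits) - 1
    diff = (src0 - src1) & mask  # the difference, truncated to the operand width
    return {
        "zero": diff == 0,                 # set when the operands are equal
        "msb": bool(diff >> (bits - 1)),   # top bit of the truncated difference
    }
```

In this model equal operands give a zero result, which is the condition a following jr.if.zero would test if the zero flag reflects the subtraction in this way.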
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jr.if.le (float (32 bit) ) 
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Operands: src0, srcl, addr 


Description: 


Relative branch to address addr if src0 <= src1. If the address
specified is an immediate, the value has to be the desired number of
instructions to jump, minus 2. For example, to jump to the next
instruction, as if no jump has taken place, the immediate value should
be -1.

Constraints:

• src0 has domain mono.
• src0 has type float.
• src0 has width of 4.
• src1 has domain mono or immediate or label.
• src1 has type float.
• src1 has width of 4.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Side Effects:

• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand,
as stated in the documentation, can be an immediate or label and has the same cycle timing
as if it were a mono register.

jr.if.le (float (64 bit))

Operands: src0, src1, addr

Description:

Relative branch to address addr if src0 <= src1. If the address
specified is an immediate, the value must be the desired number of
instructions to jump, minus 2. For example, to jump to the next
instruction, as if no jump had taken place, the immediate value
should be -1.

Constraints:
• src0 has domain mono.
• src0 has type float.
• src0 has width of 8.
• src1 has domain mono or immediate or label.
• src1 has type float.
• src1 has width of 8.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.
Side Effects:
• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand,
as stated in the documentation, can be an immediate or label and has the same cycle timing
as if it were a mono register.

jr.if.le (unsigned)

Operands: src0, src1, addr

Description:

Relative branch to address addr if src0 <= src1. If the address
specified is an immediate, the value must be the desired number of
instructions to jump, minus 2. For example, to jump to the next
instruction, as if no jump had taken place, the immediate value
should be -1.

Constraints:
• src0 has domain mono.
• src0 has type unsigned.
• src1 has domain mono or immediate or label.
• src1 has type unsigned.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.
Side Effects:
• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand,
as stated in the documentation, can be an immediate or label and has the same cycle timing
as if it were a mono register.

jr.if.le (signed)

Operands: src0, src1, addr

Description:

Relative branch to address addr if src0 <= src1. If the address
specified is an immediate, the value must be the desired number of
instructions to jump, minus 2. For example, to jump to the next
instruction, as if no jump had taken place, the immediate value
should be -1.

Constraints:
• src0 has domain mono.
• src0 has type signed.
• src1 has domain mono or immediate or label.
• src1 has type signed.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.
Side Effects:
• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand,
as stated in the documentation, can be an immediate or label and has the same cycle timing
as if it were a mono register.
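
The unsigned and signed variants can give different results for the same bit pattern. The following Python sketch illustrates this for the <= comparison. It assumes 4-byte two's complement operands (the integer entries here do not state a source width), and the helper names are hypothetical.

```python
def as_unsigned32(bits: int) -> int:
    """Interpret the low 32 bits as an unsigned integer."""
    return bits & 0xFFFFFFFF


def as_signed32(bits: int) -> int:
    """Interpret the low 32 bits as a two's complement signed integer."""
    bits &= 0xFFFFFFFF
    return bits - 0x1_0000_0000 if bits & 0x8000_0000 else bits


def le_unsigned(src0: int, src1: int) -> bool:
    """Branch condition modelled for the unsigned variant."""
    return as_unsigned32(src0) <= as_unsigned32(src1)


def le_signed(src0: int, src1: int) -> bool:
    """Branch condition modelled for the signed variant."""
    return as_signed32(src0) <= as_signed32(src1)


src0, src1 = 0xFFFFFFFF, 0x00000001
print(le_unsigned(src0, src1))  # False: 4294967295 <= 1 does not hold
print(le_signed(src0, src1))    # True: -1 <= 1 holds
```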

jr.if.eq (float (32 bit))

Operands: src0, src1, addr

Description:

Relative branch to address addr if src0 == src1. If the address
specified is an immediate, the value must be the desired number of
instructions to jump, minus 2. For example, to jump to the next
instruction, as if no jump had taken place, the immediate value
should be -1.

Constraints:
• src0 has domain mono.
• src0 has type float.
• src0 has width of 4.
• src1 has domain mono or immediate or label.
• src1 has type float.
• src1 has width of 4.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.
Side Effects:
• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand,
as stated in the documentation, can be an immediate or label and has the same cycle timing
as if it were a mono register.

jr.if.eq (float (64 bit))

Operands: src0, src1, addr

Description:

Relative branch to address addr if src0 == src1. If the address
specified is an immediate, the value must be the desired number of
instructions to jump, minus 2. For example, to jump to the next
instruction, as if no jump had taken place, the immediate value
should be -1.

Constraints:
• src0 has domain mono.
• src0 has type float.
• src0 has width of 8.
• src1 has domain mono or immediate or label.
• src1 has type float.
• src1 has width of 8.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.
Side Effects:
• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand,
as stated in the documentation, can be an immediate or label and has the same cycle timing
as if it were a mono register.
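
The operand width matters for equality tests on floating-point values. The following Python sketch shows two values that differ at 8-byte precision but compare equal once rounded to 4 bytes. It assumes IEEE 754 single and double formats, which this section does not state, and round_to_float32 is a hypothetical helper.

```python
import struct


def round_to_float32(x: float) -> float:
    """Round a Python float (8 bytes) to the nearest 4-byte float value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


a = 1.0
b = 1.0 + 1e-9

# At width 8 the two values are distinct.
equal_at_width_8 = (a == b)
# At width 4 both round to the same value, so an equality branch would be taken.
equal_at_width_4 = (round_to_float32(a) == round_to_float32(b))

print(equal_at_width_8)  # False
print(equal_at_width_4)  # True
```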

jr.if.eq (unsigned)

Operands: src0, src1, addr

Description:

Relative branch to address addr if src0 == src1. If the address
specified is an immediate, the value must be the desired number of
instructions to jump, minus 2. For example, to jump to the next
instruction, as if no jump had taken place, the immediate value
should be -1.

Constraints:
• src0 has domain mono.
• src0 has type unsigned.
• src1 has domain mono or immediate or label.
• src1 has type unsigned.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.
Side Effects:
• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand,
as stated in the documentation, can be an immediate or label and has the same cycle timing
as if it were a mono register.

jr.if.eq (signed)

Operands: src0, src1, addr

Description:

Relative branch to address addr if src0 == src1. If the address
specified is an immediate, the value must be the desired number of
instructions to jump, minus 2. For example, to jump to the next
instruction, as if no jump had taken place, the immediate value
should be -1.

Constraints:
• src0 has domain mono.
• src0 has type signed.
• src1 has domain mono or immediate or label.
• src1 has type signed.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.
Side Effects:
• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand,
as stated in the documentation, can be an immediate or label and has the same cycle timing
as if it were a mono register.

jr.if.ne (float (32 bit))

Operands: src0, src1, addr

Description:

Relative branch to address addr if src0 != src1. If the address
specified is an immediate, the value must be the desired number of
instructions to jump, minus 2. For example, to jump to the next
instruction, as if no jump had taken place, the immediate value
should be -1.

Constraints:
• src0 has domain mono.
• src0 has type float.
• src0 has width of 4.
• src1 has domain mono or immediate or label.
• src1 has type float.
• src1 has width of 4.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.
Side Effects:
• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand,
as stated in the documentation, can be an immediate or label and has the same cycle timing
as if it were a mono register.

jr.if.ne (float (64 bit))

Operands: src0, src1, addr

Description:

Relative branch to address addr if src0 != src1. If the address
specified is an immediate, the value must be the desired number of
instructions to jump, minus 2. For example, to jump to the next
instruction, as if no jump had taken place, the immediate value
should be -1.

Constraints:
• src0 has domain mono.
• src0 has type float.
• src0 has width of 8.
• src1 has domain mono or immediate or label.
• src1 has type float.
• src1 has width of 8.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.
Side Effects:
• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand,
as stated in the documentation, can be an immediate or label and has the same cycle timing
as if it were a mono register.

jr.if.ne (unsigned)

Operands: src0, src1, addr

Description:

Relative branch to address addr if src0 != src1. If the address
specified is an immediate, the value must be the desired number of
instructions to jump, minus 2. For example, to jump to the next
instruction, as if no jump had taken place, the immediate value
should be -1.

Constraints:
• src0 has domain mono.
• src0 has type unsigned.
• src1 has domain mono or immediate or label.
• src1 has type unsigned.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.
Side Effects:
• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand,
as stated in the documentation, can be an immediate or label and has the same cycle timing
as if it were a mono register.

jr.if.ne (signed)

Operands: src0, src1, addr

Description:

Relative branch to address addr if src0 != src1. If the address
specified is an immediate, the value must be the desired number of
instructions to jump, minus 2. For example, to jump to the next
instruction, as if no jump had taken place, the immediate value
should be -1.

Constraints:
• src0 has domain mono.
• src0 has type signed.
• src1 has domain mono or immediate or label.
• src1 has type signed.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.
Side Effects:
• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand,
as stated in the documentation, can be an immediate or label and has the same cycle timing
as if it were a mono register.

jr.if.gt (float (32 bit))

Operands: src0, src1, addr

Description:

Relative branch to address addr if src0 > src1. If the address
specified is an immediate, the value must be the desired number of
instructions to jump, minus 2. For example, to jump to the next
instruction, as if no jump had taken place, the immediate value
should be -1.

Constraints:
• src0 has domain mono.
• src0 has type float.
• src0 has width of 4.
• src1 has domain mono or immediate or label.
• src1 has type float.
• src1 has width of 4.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.
Side Effects:
• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand,
as stated in the documentation, can be an immediate or label and has the same cycle timing
as if it were a mono register.

jr.if.gt (float (64 bit))

Operands: src0, src1, addr

Description:

Relative branch to address addr if src0 > src1. If the address
specified is an immediate, the value must be the desired number of
instructions to jump, minus 2. For example, to jump to the next
instruction, as if no jump had taken place, the immediate value
should be -1.

Constraints:
• src0 has domain mono.
• src0 has type float.
• src0 has width of 8.
• src1 has domain mono or immediate or label.
• src1 has type float.
• src1 has width of 8.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.
Side Effects:
• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand,
as stated in the documentation, can be an immediate or label and has the same cycle timing
as if it were a mono register.

jr.if.gt (unsigned)

Operands: src0, src1, addr

Description:

Relative branch to address addr if src0 > src1. If the address
specified is an immediate, the value must be the desired number of
instructions to jump, minus 2. For example, to jump to the next
instruction, as if no jump had taken place, the immediate value
should be -1.

Constraints:
• src0 has domain mono.
• src0 has type unsigned.
• src1 has domain mono or immediate or label.
• src1 has type unsigned.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.
Side Effects:
• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand,
as stated in the documentation, can be an immediate or label and has the same cycle timing
as if it were a mono register.




jr.if.gt (signed)

Operands: src0, src1, addr

Description:

Relative branch to address addr if src0 > src1. If the address
specified is an immediate, the value must be the desired number of
instructions to jump, minus 2. For example, to jump to the next
instruction, as if no jump had taken place, the immediate value
should be -1.

Constraints:
• src0 has domain mono.
• src0 has type signed.
• src1 has domain mono or immediate or label.
• src1 has type signed.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.
Side Effects:
• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand,
as stated in the documentation, can be an immediate or label and has the same cycle timing
as if it were a mono register.

jr.if.ge (float (32 bit))

Operands: src0, src1, addr

Description:

Relative branch to address addr if src0 >= src1. If the address
specified is an immediate, the value must be the desired number of
instructions to jump, minus 2. For example, to jump to the next
instruction, as if no jump had taken place, the immediate value
should be -1.

Constraints:
• src0 has domain mono.
• src0 has type float.
• src0 has width of 4.
• src1 has domain mono or immediate or label.
• src1 has type float.
• src1 has width of 4.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.
Side Effects:
• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand,
as stated in the documentation, can be an immediate or label and has the same cycle timing
as if it were a mono register.

jr.if.ge (float (64 bit))

Operands: src0, src1, addr

Description:

Relative branch to address addr if src0 >= src1. If the address
specified is an immediate, the value must be the desired number of
instructions to jump, minus 2. For example, to jump to the next
instruction, as if no jump had taken place, the immediate value
should be -1.

Constraints:
• src0 has domain mono.
• src0 has type float.
• src0 has width of 8.
• src1 has domain mono or immediate or label.
• src1 has type float.
• src1 has width of 8.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.
Side Effects:
• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand,
as stated in the documentation, can be an immediate or label and has the same cycle timing
as if it were a mono register.

jr.if.ge (unsigned)

Operands: src0, src1, addr

Description:

Relative branch to address addr if src0 >= src1. If the address
specified is an immediate, the value must be the desired number of
instructions to jump, minus 2. For example, to jump to the next
instruction, as if no jump had taken place, the immediate value
should be -1.

Constraints:
• src0 has domain mono.
• src0 has type unsigned.
• src1 has domain mono or immediate or label.
• src1 has type unsigned.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.
Side Effects:
• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand,
as stated in the documentation, can be an immediate or label and has the same cycle timing
as if it were a mono register.

jr.if.ge (signed)

Operands: src0, src1, addr

Description:

Relative branch to address addr if src0 >= src1. If the address
specified is an immediate, the value must be the desired number of
instructions to jump, minus 2. For example, to jump to the next
instruction, as if no jump had taken place, the immediate value
should be -1.

Constraints:
• src0 has domain mono.
• src0 has type signed.
• src1 has domain mono or immediate or label.
• src1 has type signed.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.
Side Effects:
• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand,
as stated in the documentation, can be an immediate or label and has the same cycle timing
as if it were a mono register.

jr.sub

Operands: addr

Description:

Relative branch to address addr, saving the return address. If the
address specified is an immediate, the value must be the desired
number of instructions to jump, minus 2. For example, to jump to the
next instruction, as if no jump had taken place, the immediate value
should be -1.

Constraints:
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.
Notes: The address saved can be branched to with the return instruction or read into
registers with the return.get instruction.

Details:

addr   cycles  latency  temporaries | comments
m4us   1       1                    | hardware
il4us  1       1                    | hardware
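
The interaction between jr.sub, return and return.get can be pictured with a small behavioural model. The following Python sketch is not CSX600 code. The class name BranchModel is hypothetical, and it assumes that the saved return address is the instruction following the branch, which this section does not state.

```python
class BranchModel:
    """Toy model of a program counter with one saved return address."""

    def __init__(self) -> None:
        self.pc = 0                 # index of the current instruction
        self.return_address = None  # value saved by jr_sub

    def jr_sub(self, immediate: int) -> None:
        # Assumption: the saved address is the instruction after the branch.
        self.return_address = self.pc + 1
        # The immediate is 'instructions to jump, minus 2', so add 2 back.
        self.pc = self.pc + immediate + 2

    def ret(self) -> None:
        # Models the return instruction: branch to the saved address.
        self.pc = self.return_address

    def return_get(self) -> int:
        # Models return.get: read the saved address without branching.
        return self.return_address


model = BranchModel()
model.pc = 10
model.jr_sub(3)              # jump 5 instructions ahead
print(model.pc)              # 15
print(model.return_get())    # 11
model.ret()
print(model.pc)              # 11
```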

jr.sub.if.pred

Operands: pred, addr

Description:

Relative branch to address addr if the predicate pred is set to 1,
saving the return address. If the address specified is an immediate,
the value must be the desired number of instructions to jump, minus 2.
For example, to jump to the next instruction, as if no jump had taken
place, the immediate value should be -1.

Constraints:
• pred has domain immediate.
• pred has type integer.
• pred has width of 4.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.
Notes: The immediate values for the various predicate bits are as follows:
0: Carry or FPU inexact
1: Zero
2: MSB or FPU underflow
3: Overflow
4: Negative
5: True, always set
6-15: User defined

Details:

pred  addr   cycles  latency  temporaries | comments
i4us  m4us   1       1                    | hardware
i4us  il4us  1       1                    | hardware
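
The predicate numbering in the notes can be captured as a lookup. The following Python sketch models the branch decision for the pred and npred forms. It assumes that predicate n is held in bit n of a status word, which is an illustrative layout and not something this section specifies; all names are hypothetical.

```python
from enum import IntEnum


class Predicate(IntEnum):
    """Immediate values for the predicate bits, as listed in the notes."""
    CARRY_OR_FPU_INEXACT = 0
    ZERO = 1
    MSB_OR_FPU_UNDERFLOW = 2
    OVERFLOW = 3
    NEGATIVE = 4
    TRUE = 5  # always set


def is_user_defined(pred: int) -> bool:
    """Predicates 6-15 are user defined."""
    return 6 <= pred <= 15


def branch_taken(predicate_bits: int, pred: int, negate: bool = False) -> bool:
    """Return True if the branch would be taken.

    Assumption: predicate n is bit n of predicate_bits.
    negate=False models the 'set to 1' form, negate=True the 'set to 0' form.
    """
    if not 0 <= pred <= 15:
        raise ValueError("pred must be in the range 0-15")
    value = (predicate_bits >> pred) & 1
    return value == 0 if negate else value == 1


bits = (1 << Predicate.ZERO) | (1 << Predicate.TRUE)
print(branch_taken(bits, Predicate.ZERO))                               # True
print(branch_taken(bits, Predicate.CARRY_OR_FPU_INEXACT))               # False
print(branch_taken(bits, Predicate.CARRY_OR_FPU_INEXACT, negate=True))  # True
```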

jr.sub.if.npred

Operands: pred, addr

Description:

Relative branch to address addr if the predicate pred is set to 0,
saving the return address. If the address specified is an immediate,
the value must be the desired number of instructions to jump, minus 2.
For example, to jump to the next instruction, as if no jump had taken
place, the immediate value should be -1.

Constraints:
• pred has domain immediate.
• pred has type integer.
• pred has width of 4.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.
Notes: The immediate values for the various predicate bits are as follows:
0: Carry or FPU inexact
1: Zero
2: MSB or FPU underflow
3: Overflow
4: Negative
5: True, always set
6-15: User defined

Details:

pred  addr   cycles  latency  temporaries | comments
i4us  m4us   1       1                    | hardware
i4us  il4us  1       1                    | hardware
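
The operand codes in the Details tables (m4us, i4us, il4us) appear to pack the domain, width and type constraints into one string. The following Python sketch decodes them under that reading. The letter meanings are assumptions inferred from the constraints listed above (m = mono, i = immediate, l = label, u = unsigned, s = signed) plus p = poly and f = float by analogy; they are not confirmed by this section.

```python
import re

# Assumed letter meanings; see the lead-in for how they were inferred.
DOMAINS = {"m": "mono", "p": "poly", "i": "immediate", "l": "label"}
TYPES = {"u": "unsigned", "s": "signed", "f": "float"}


def decode_operand_code(code: str) -> dict:
    """Split a code such as 'il4us' into domains, width and types."""
    match = re.fullmatch(r"([a-z]+)(\d+)([a-z]+)", code)
    if match is None:
        raise ValueError(f"unrecognised operand code: {code!r}")
    domain_letters, width, type_letters = match.groups()
    return {
        "domains": [DOMAINS[letter] for letter in domain_letters],
        "width": int(width),
        "types": [TYPES[letter] for letter in type_letters],
    }


print(decode_operand_code("m4us"))
print(decode_operand_code("il4us"))
```

Under these assumptions, il4us reads as "immediate or label, width 4, unsigned or signed", which agrees with the addr constraints, and i4us agrees with the pred constraints.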

jr.sub.if.cry

Operands: addr

Description:

Relative branch to address addr if the carry flag is set to 1, saving
the return address. If the address specified is an immediate, the
value must be the desired number of instructions to jump, minus 2.
For example, to jump to the next instruction, as if no jump had taken
place, the immediate value should be -1.

Constraints:
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.
Notes: The address saved can be branched to with the return instruction or read into
registers with the return.get instruction.

Details:

addr   cycles  latency  temporaries | comments
m4us   1       1                    | hardware
il4us  1       1                    | hardware

Operands: addr

Description:

Relative branch to address addr if the carry flag is set to 0, saving
the return address. If the address specified is an immediate, the
value must be the desired number of instructions to jump, minus 2.
For example, to jump to the next instruction, as if no jump had taken
place, the immediate value should be -1.

Constraints:
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.
Notes: The address saved can be branched to with the return instruction or read into
registers with the return.get instruction.

Details:

addr   cycles  latency  temporaries | comments
m4us   1       1                    | hardware
il4us  1       1                    | hardware

Operands: addr

Description:

Relative branch to address addr if the zero flag is set to 1, saving
the return address. If the address specified is an immediate, the
value must be the desired number of instructions to jump, minus 2.
For example, to jump to the next instruction, as if no jump had taken
place, the immediate value should be -1.

Constraints:
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.
Notes: The address saved can be branched to with the return instruction or read into
registers with the return.get instruction.

Details:

addr   cycles  latency  temporaries | comments
m4us   1       1                    | hardware
il4us  1       1                    | hardware

Operands: addr

Description:

Relative branch to address addr if the zero flag is set to 0, saving
the return address. If the address specified is an immediate, the
value must be the desired number of instructions to jump, minus 2.
For example, to jump to the next instruction, as if no jump had taken
place, the immediate value should be -1.

Constraints:
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.
Notes: The address saved can be branched to with the return instruction or read into
registers with the return.get instruction.

Details:

addr   cycles  latency  temporaries | comments
m4us   1       1                    | hardware
il4us  1       1                    | hardware

jr.sub.if.neg

Operands: addr

Description:

Relative branch to address addr if the neg flag is set to 1, saving
the return address. If the address specified is an immediate, the
value must be the desired number of instructions to jump, minus 2.
For example, to jump to the next instruction, as if no jump had taken
place, the immediate value should be -1.

Constraints:
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.
Notes: The address saved can be branched to with the return instruction or read into
registers with the return.get instruction.

Details:

addr   cycles  latency  temporaries | comments
m4us   1       1                    | hardware
il4us  1       1                    | hardware

Operands: addr

Description:

Relative branch to address addr if the neg flag is set to 0, saving
the return address. If the address specified is an immediate, the
value must be the desired number of instructions to jump, minus 2.
For example, to jump to the next instruction, as if no jump had taken
place, the immediate value should be -1.

Constraints:
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.
Notes: The address saved can be branched to with the return instruction or read into
registers with the return.get instruction.

Details:

addr   cycles  latency  temporaries | comments
m4us   1       1                    | hardware
il4us  1       1                    | hardware

jr.sub.if.msb

Operands: addr

Description:

Relative branch to address addr if the msb flag is set to 1, saving
the return address. If the address specified is an immediate, the
value must be the desired number of instructions to jump, minus 2.
For example, to jump to the next instruction, as if no jump had taken
place, the immediate value should be -1.

Constraints:
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.
Notes: The address saved can be branched to with the return instruction or read into
registers with the return.get instruction.

Details:

addr   cycles  latency  temporaries | comments
m4us   1       1                    | hardware
il4us  1       1                    | hardware

Operands: addr

Description:

Relative branch to address addr if the msb flag is set to 0, saving
the return address. If the address specified is an immediate, the
value must be the desired number of instructions to jump, minus 2.
For example, to jump to the next instruction, as if no jump had taken
place, the immediate value should be -1.

Constraints:
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.
Notes: The address saved can be branched to with the return instruction or read into
registers with the return.get instruction.

Details:

addr cycles latency temporaries | comments
m4us 1 hardware
il4us 1 hardware

jr.sub.if.ovf

Operands: addr

Description:

Relative branch to address addr if the overflow flag is set to 1,
saving the return address. If the address specified is an immediate,
the value must be the desired number of instructions to jump, minus 2.
For example, to jump to the next instruction, as if no jump had taken
place, the immediate value should be -1.

Constraints:
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.
Notes: The address saved can be branched to with the return instruction or read into
registers with the return.get instruction.

Details:

addr   cycles  latency  temporaries | comments
m4us   1       1                    | hardware
il4us  1       1                    | hardware

Operands: addr

Description:

Relative branch to address addr if the overflow flag is set to 0,
saving the return address. If the address specified is an immediate,
the value must be the desired number of instructions to jump, minus 2.
For example, to jump to the next instruction, as if no jump had taken
place, the immediate value should be -1.

Constraints:
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.
Notes: The address saved can be branched to with the return instruction or read into
registers with the return.get instruction.

Details:

addr   cycles  latency  temporaries | comments
m4us   1       1                    | hardware
il4us  1       1                    | hardware

jr.sub.if.lt (float (32 bit))

Operands: src0, src1, addr

Description:

Relative branch to address addr if src0 < src1, saving the return
address. If the address specified is an immediate, the value must be
the desired number of instructions to jump, minus 2. For example, to
jump to the next instruction, as if no jump had taken place, the
immediate value should be -1.

Constraints:
• src0 has domain mono.
• src0 has type float.
• src0 has width of 4.
• src1 has domain mono or immediate or label.
• src1 has type float.
• src1 has width of 4.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.
Side Effects:
• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand,
as stated in the documentation, can be an immediate or label and has the same cycle timing
as if it were a mono register. The address saved can be branched to with the return
instruction or read into registers with the return.get instruction.

jr.sub.if.lt (float (64 bit))

Operands: src0, src1, addr

Description:

Relative branch to address addr if src0 < src1, saving the return
address. If the address specified is an immediate, the value must be
the desired number of instructions to jump, minus 2. For example, to
jump to the next instruction, as if no jump had taken place, the
immediate value should be -1.

Constraints:
• src0 has domain mono.
• src0 has type float.
• src0 has width of 8.
• src1 has domain mono or immediate or label.
• src1 has type float.
• src1 has width of 8.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.
Side Effects:
• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand,
as stated in the documentation, can be an immediate or label and has the same cycle timing
as if it were a mono register. The address saved can be branched to with the return
instruction or read into registers with the return.get instruction.

jr.sub.if.lt (unsigned)

Operands: src0, src1, addr

Description:

Relative branch to address addr if src0 < src1, saving the return
address. If the address specified is an immediate, the value must be
the desired number of instructions to jump, minus 2. For example, to
jump to the next instruction, as if no jump had taken place, the
immediate value should be -1.

Constraints:
• src0 has domain mono.
• src0 has type unsigned.
• src1 has domain mono or immediate or label.
• src1 has type unsigned.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.
Side Effects:
• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand,
as stated in the documentation, can be an immediate or label and has the same cycle timing
as if it were a mono register. The address saved can be branched to with the return
instruction or read into registers with the return.get instruction.

jr.sub.if.lt (signed)

Operands: src0, src1, addr

Description:

Relative branch to address addr if src0 < src1, saving the return
address. If the address specified is an immediate, the value must be
the desired number of instructions to jump, minus 2. For example, to
jump to the next instruction, as if no jump had taken place, the
immediate value should be -1.

Constraints:
• src0 has domain mono.
• src0 has type signed.
• src1 has domain mono or immediate or label.
• src1 has type signed.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.
Side Effects:
• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand,
as stated in the documentation, can be an immediate or label and has the same cycle timing
as if it were a mono register. The address saved can be branched to with the return
instruction or read into registers with the return.get instruction.
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CSX600 Instruction Set Instruction set description 


jr.sub.if.le (float (32 bit))

Operands: src0, src1, addr

Description:

Relative branch to address addr if src0 <= src1, saving the return
address. If the address is specified as an immediate, its value must
be the desired number of instructions to jump minus 2. For example,
to jump to the next instruction, as if no jump had taken place, the
immediate value should be -1.

Constraints:
• src0 has domain mono.
• src0 has type float.
• src0 has width of 4.
• src1 has domain mono or immediate or label.
• src1 has type float.
• src1 has width of 4.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Side Effects:
• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand,
as stated in the constraints, can be an immediate or label and has the same cycle timing
as if it were a mono register. The saved address can be branched to with the return
instruction or read into registers with the return.get instruction.


jr.sub.if.le (float (64 bit))

Operands: src0, src1, addr

Description:

Relative branch to address addr if src0 <= src1, saving the return
address. If the address is specified as an immediate, its value must
be the desired number of instructions to jump minus 2. For example,
to jump to the next instruction, as if no jump had taken place, the
immediate value should be -1.

Constraints:
• src0 has domain mono.
• src0 has type float.
• src0 has width of 8.
• src1 has domain mono or immediate or label.
• src1 has type float.
• src1 has width of 8.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Side Effects:
• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand,
as stated in the constraints, can be an immediate or label and has the same cycle timing
as if it were a mono register. The saved address can be branched to with the return
instruction or read into registers with the return.get instruction.


jr.sub.if.le (unsigned)

Operands: src0, src1, addr

Description:

Relative branch to address addr if src0 <= src1, saving the return
address. If the address is specified as an immediate, its value must
be the desired number of instructions to jump minus 2. For example,
to jump to the next instruction, as if no jump had taken place, the
immediate value should be -1.

Constraints:
• src0 has domain mono.
• src0 has type unsigned.
• src1 has domain mono or immediate or label.
• src1 has type unsigned.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Side Effects:
• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand,
as stated in the constraints, can be an immediate or label and has the same cycle timing
as if it were a mono register. The saved address can be branched to with the return
instruction or read into registers with the return.get instruction.
jr.sub.if.le (signed)

Operands: src0, src1, addr

Description:

Relative branch to address addr if src0 <= src1, saving the return
address. If the address is specified as an immediate, its value must
be the desired number of instructions to jump minus 2. For example,
to jump to the next instruction, as if no jump had taken place, the
immediate value should be -1.

Constraints:
• src0 has domain mono.
• src0 has type signed.
• src1 has domain mono or immediate or label.
• src1 has type signed.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Side Effects:
• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand,
as stated in the constraints, can be an immediate or label and has the same cycle timing
as if it were a mono register. The saved address can be branched to with the return
instruction or read into registers with the return.get instruction.


jr.sub.if.eq (float (32 bit))

Operands: src0, src1, addr

Description:

Relative branch to address addr if src0 == src1, saving the return
address. If the address is specified as an immediate, its value must
be the desired number of instructions to jump minus 2. For example,
to jump to the next instruction, as if no jump had taken place, the
immediate value should be -1.

Constraints:
• src0 has domain mono.
• src0 has type float.
• src0 has width of 4.
• src1 has domain mono or immediate or label.
• src1 has type float.
• src1 has width of 4.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Side Effects:
• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand,
as stated in the constraints, can be an immediate or label and has the same cycle timing
as if it were a mono register. The saved address can be branched to with the return
instruction or read into registers with the return.get instruction.


jr.sub.if.eq (float (64 bit))

Operands: src0, src1, addr

Description:

Relative branch to address addr if src0 == src1, saving the return
address. If the address is specified as an immediate, its value must
be the desired number of instructions to jump minus 2. For example,
to jump to the next instruction, as if no jump had taken place, the
immediate value should be -1.

Constraints:
• src0 has domain mono.
• src0 has type float.
• src0 has width of 8.
• src1 has domain mono or immediate or label.
• src1 has type float.
• src1 has width of 8.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Side Effects:
• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand,
as stated in the constraints, can be an immediate or label and has the same cycle timing
as if it were a mono register. The saved address can be branched to with the return
instruction or read into registers with the return.get instruction.


jr.sub.if.eq (unsigned)

Operands: src0, src1, addr

Description:

Relative branch to address addr if src0 == src1, saving the return
address. If the address is specified as an immediate, its value must
be the desired number of instructions to jump minus 2. For example,
to jump to the next instruction, as if no jump had taken place, the
immediate value should be -1.

Constraints:
• src0 has domain mono.
• src0 has type unsigned.
• src1 has domain mono or immediate or label.
• src1 has type unsigned.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Side Effects:
• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand,
as stated in the constraints, can be an immediate or label and has the same cycle timing
as if it were a mono register. The saved address can be branched to with the return
instruction or read into registers with the return.get instruction.


jr.sub.if.eq (signed)

Operands: src0, src1, addr

Description:

Relative branch to address addr if src0 == src1, saving the return
address. If the address is specified as an immediate, its value must
be the desired number of instructions to jump minus 2. For example,
to jump to the next instruction, as if no jump had taken place, the
immediate value should be -1.

Constraints:
• src0 has domain mono.
• src0 has type signed.
• src1 has domain mono or immediate or label.
• src1 has type signed.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Side Effects:
• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand,
as stated in the constraints, can be an immediate or label and has the same cycle timing
as if it were a mono register. The saved address can be branched to with the return
instruction or read into registers with the return.get instruction.


jr.sub.if.ne (float (32 bit))

Operands: src0, src1, addr

Description:

Relative branch to address addr if src0 != src1, saving the return
address. If the address is specified as an immediate, its value must
be the desired number of instructions to jump minus 2. For example,
to jump to the next instruction, as if no jump had taken place, the
immediate value should be -1.

Constraints:
• src0 has domain mono.
• src0 has type float.
• src0 has width of 4.
• src1 has domain mono or immediate or label.
• src1 has type float.
• src1 has width of 4.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Side Effects:
• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand,
as stated in the constraints, can be an immediate or label and has the same cycle timing
as if it were a mono register. The saved address can be branched to with the return
instruction or read into registers with the return.get instruction.


jr.sub.if.ne (float (64 bit))

Operands: src0, src1, addr

Description:

Relative branch to address addr if src0 != src1, saving the return
address. If the address is specified as an immediate, its value must
be the desired number of instructions to jump minus 2. For example,
to jump to the next instruction, as if no jump had taken place, the
immediate value should be -1.

Constraints:
• src0 has domain mono.
• src0 has type float.
• src0 has width of 8.
• src1 has domain mono or immediate or label.
• src1 has type float.
• src1 has width of 8.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Side Effects:
• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand,
as stated in the constraints, can be an immediate or label and has the same cycle timing
as if it were a mono register. The saved address can be branched to with the return
instruction or read into registers with the return.get instruction.


jr.sub.if.ne (unsigned)

Operands: src0, src1, addr

Description:

Relative branch to address addr if src0 != src1, saving the return
address. If the address is specified as an immediate, its value must
be the desired number of instructions to jump minus 2. For example,
to jump to the next instruction, as if no jump had taken place, the
immediate value should be -1.

Constraints:
• src0 has domain mono.
• src0 has type unsigned.
• src1 has domain mono or immediate or label.
• src1 has type unsigned.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Side Effects:
• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand,
as stated in the constraints, can be an immediate or label and has the same cycle timing
as if it were a mono register. The saved address can be branched to with the return
instruction or read into registers with the return.get instruction.


jr.sub.if.ne (signed)

Operands: src0, src1, addr

Description:

Relative branch to address addr if src0 != src1, saving the return
address. If the address is specified as an immediate, its value must
be the desired number of instructions to jump minus 2. For example,
to jump to the next instruction, as if no jump had taken place, the
immediate value should be -1.

Constraints:
• src0 has domain mono.
• src0 has type signed.
• src1 has domain mono or immediate or label.
• src1 has type signed.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Side Effects:
• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand,
as stated in the constraints, can be an immediate or label and has the same cycle timing
as if it were a mono register. The saved address can be branched to with the return
instruction or read into registers with the return.get instruction.


jr.sub.if.gt (float (32 bit))

Operands: src0, src1, addr

Description:

Relative branch to address addr if src0 > src1, saving the return
address. If the address is specified as an immediate, its value must
be the desired number of instructions to jump minus 2. For example,
to jump to the next instruction, as if no jump had taken place, the
immediate value should be -1.

Constraints:
• src0 has domain mono.
• src0 has type float.
• src0 has width of 4.
• src1 has domain mono or immediate or label.
• src1 has type float.
• src1 has width of 4.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Side Effects:
• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand,
as stated in the constraints, can be an immediate or label and has the same cycle timing
as if it were a mono register. The saved address can be branched to with the return
instruction or read into registers with the return.get instruction.


jr.sub.if.gt (float (64 bit))

Operands: src0, src1, addr

Description:

Relative branch to address addr if src0 > src1, saving the return
address. If the address is specified as an immediate, its value must
be the desired number of instructions to jump minus 2. For example,
to jump to the next instruction, as if no jump had taken place, the
immediate value should be -1.

Constraints:
• src0 has domain mono.
• src0 has type float.
• src0 has width of 8.
• src1 has domain mono or immediate or label.
• src1 has type float.
• src1 has width of 8.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Side Effects:
• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand,
as stated in the constraints, can be an immediate or label and has the same cycle timing
as if it were a mono register. The saved address can be branched to with the return
instruction or read into registers with the return.get instruction.


jr.sub.if.gt (unsigned)

Operands: src0, src1, addr

Description:

Relative branch to address addr if src0 > src1, saving the return
address. If the address is specified as an immediate, its value must
be the desired number of instructions to jump minus 2. For example,
to jump to the next instruction, as if no jump had taken place, the
immediate value should be -1.

Constraints:
• src0 has domain mono.
• src0 has type unsigned.
• src1 has domain mono or immediate or label.
• src1 has type unsigned.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Side Effects:
• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand,
as stated in the constraints, can be an immediate or label and has the same cycle timing
as if it were a mono register. The saved address can be branched to with the return
instruction or read into registers with the return.get instruction.


jr.sub.if.gt (signed)

Operands: src0, src1, addr

Description:

Relative branch to address addr if src0 > src1, saving the return
address. If the address is specified as an immediate, its value must
be the desired number of instructions to jump minus 2. For example,
to jump to the next instruction, as if no jump had taken place, the
immediate value should be -1.

Constraints:
• src0 has domain mono.
• src0 has type signed.
• src1 has domain mono or immediate or label.
• src1 has type signed.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Side Effects:
• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand,
as stated in the constraints, can be an immediate or label and has the same cycle timing
as if it were a mono register. The saved address can be branched to with the return
instruction or read into registers with the return.get instruction.


jr.sub.if.ge (float (32 bit))

Operands: src0, src1, addr

Description:

Relative branch to address addr if src0 >= src1, saving the return
address. If the address is specified as an immediate, its value must
be the desired number of instructions to jump minus 2. For example,
to jump to the next instruction, as if no jump had taken place, the
immediate value should be -1.

Constraints:
• src0 has domain mono.
• src0 has type float.
• src0 has width of 4.
• src1 has domain mono or immediate or label.
• src1 has type float.
• src1 has width of 4.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Side Effects:
• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand,
as stated in the constraints, can be an immediate or label and has the same cycle timing
as if it were a mono register. The saved address can be branched to with the return
instruction or read into registers with the return.get instruction.


jr.sub.if.ge (float (64 bit))

Operands: src0, src1, addr

Description:

Relative branch to address addr if src0 >= src1, saving the return
address. If the address is specified as an immediate, its value must
be the desired number of instructions to jump minus 2. For example,
to jump to the next instruction, as if no jump had taken place, the
immediate value should be -1.

Constraints:
• src0 has domain mono.
• src0 has type float.
• src0 has width of 8.
• src1 has domain mono or immediate or label.
• src1 has type float.
• src1 has width of 8.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Side Effects:
• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand,
as stated in the constraints, can be an immediate or label and has the same cycle timing
as if it were a mono register. The saved address can be branched to with the return
instruction or read into registers with the return.get instruction.


jr.sub.if.ge (unsigned)

Operands: src0, src1, addr

Description:

Relative branch to address addr if src0 >= src1, saving the return
address. If the address is specified as an immediate, its value must
be the desired number of instructions to jump minus 2. For example,
to jump to the next instruction, as if no jump had taken place, the
immediate value should be -1.

Constraints:
• src0 has domain mono.
• src0 has type unsigned.
• src1 has domain mono or immediate or label.
• src1 has type unsigned.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Side Effects:
• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand,
as stated in the constraints, can be an immediate or label and has the same cycle timing
as if it were a mono register. The saved address can be branched to with the return
instruction or read into registers with the return.get instruction.


jr.sub.if.ge (signed)

Operands: src0, src1, addr

Description:

Relative branch to address addr if src0 >= src1, saving the return
address. If the address is specified as an immediate, its value must
be the desired number of instructions to jump minus 2. For example,
to jump to the next instruction, as if no jump had taken place, the
immediate value should be -1.

Constraints:
• src0 has domain mono.
• src0 has type signed.
• src1 has domain mono or immediate or label.
• src1 has type signed.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Side Effects:
• Updates the status register.

Notes: The status is updated as if src1 had been subtracted from src0. The second operand,
as stated in the constraints, can be an immediate or label and has the same cycle timing
as if it were a mono register. The saved address can be branched to with the return
instruction or read into registers with the return.get instruction.
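
The jr.sub.if.cc family shares one pattern: the condition is evaluated as if src1 had been subtracted from src0. The Python sketch below models only that decision, using unbounded integers. It does not model the actual status register bits, floating-point special cases, or the saved return address, and the helper names and the 4-byte default width are assumptions made for illustration.

```python
import operator

# Condition codes used by the jr.sub.if.cc mnemonics in this section.
CONDITIONS = {
    "lt": operator.lt,
    "le": operator.le,
    "eq": operator.eq,
    "ne": operator.ne,
    "gt": operator.gt,
    "ge": operator.ge,
}


def branch_taken(cc: str, src0: int, src1: int) -> bool:
    """Decide the branch by comparing (src0 - src1) against zero."""
    difference = src0 - src1
    return CONDITIONS[cc](difference, 0)


def as_signed(bits: int, width_bytes: int = 4) -> int:
    """Reinterpret a raw bit pattern as a two's complement signed value."""
    n = 8 * width_bytes
    bits &= (1 << n) - 1
    return bits - (1 << n) if bits >> (n - 1) else bits


# The same bit pattern compares differently in the signed and unsigned variants.
raw = 0xFFFFFFFF
signed_lt = branch_taken("lt", as_signed(raw), 1)   # -1 < 1
unsigned_lt = branch_taken("lt", raw, 1)            # 4294967295 < 1
```

This is why each condition is listed once per operand type: the operand type selects how the bits of src0 and src1 are interpreted before the comparison.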


3.1.9 Loads and Store instructions 


These instructions perform loads and stores between registers and memory of the same
domain. Macro load and store instructions can be used by including ldst.inc.
ld (mono, no offset)

Operands: dst, addr

Description:

Loads into the mono registers dst, width(dst) bytes from address
addr.

Constraints:
• dst has domain mono.
• dst has maximum width of 32.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.

Notes: addr is assumed to be aligned to the width of dst or 16.

Details:
dst   addr    cycles  latency  temporaries  comments
m1    m4us    6       6                     hardware
m1    il4us   10      10       4 mono       macro
m2    m4us    6       6                     hardware
m2    il4us   10      10       4 mono       macro
m4    m4us    6       6                     hardware
m4    il4us   10      10       4 mono       macro
m8    m4us    6       6                     hardware
m8    il4us   10      10       4 mono       macro
m16   m4us    6       6                     hardware
m16   il4us   10      10       4 mono       macro
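
The alignment note above is terse. One possible reading, which is an assumption and not confirmed by this manual, is that the address must be aligned to the width of dst, capped at 16 bytes for wider operands. A small Python sketch of that reading, with hypothetical helper names:

```python
def assumed_alignment(width_bytes: int) -> int:
    """Alignment in bytes under the assumed reading: width of dst, capped at 16."""
    return min(width_bytes, 16)


def is_suitably_aligned(addr: int, width_bytes: int) -> bool:
    """True if addr meets the assumed alignment for an operand of this width."""
    return addr % assumed_alignment(width_bytes) == 0


# An 8-byte load from 0x1008 is aligned; from 0x1004 it is not.
ok = is_suitably_aligned(0x1008, 8)
bad = is_suitably_aligned(0x1004, 8)
```

Because the instruction assumes alignment rather than checking it, a check of this kind belongs in the code that computes the address.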


ld (mono with offset)

Operands: dst, addr, offset

Description:

Loads into the mono registers dst, width(dst) bytes from address
(addr+offset).

Constraints:
• dst has domain mono.
• dst has maximum width of 32.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.
• offset has domain mono or immediate.
• offset has type integer.
• offset has width of 2.

Notes: The address addr + offset is assumed to be aligned to the width of dst or 16. Offset
may be no more than 65535 and is treated as unsigned.

Details:
dst   addr    offset  cycles  latency  temporaries  comments
m1    m4us    m2us    6       6                     hardware
m1    m4us    i2us    6       6                     hardware
m1    m4us    128 us  6       6                     hardware
m1    il4us   m2us    10      10       4 mono       macro
m1    il4us   i2us    10      10       4 mono       macro
m1    il4us   128 us  10      10       4 mono       macro
m2    m4us    m2us    6       6                     hardware
m2    m4us    i2us    6       6                     hardware
m2    m4us    128 us  6       6                     hardware
m2    il4us   m2us    10      10       4 mono       macro
m2    il4us   i2us    10      10       4 mono       macro
m2    il4us   128 us  10      10       4 mono       macro
m4    m4us    m2us    6       6                     hardware
m4    m4us    i2us    6       6                     hardware
m4    m4us    128 us  6       6                     hardware
m4    il4us   m2us    10      10       4 mono       macro
m4    il4us   i2us    10      10       4 mono       macro
m4    il4us   128 us  10      10       4 mono       macro
m8    m4us    m2us    6       6                     hardware
m8    m4us    i2us    6       6                     hardware
m8    m4us    128 us  6       6                     hardware
m8    il4us   m2us    10      10       4 mono       macro
m8    il4us   i2us    10      10       4 mono       macro
m8    il4us   128 us  10      10       4 mono       macro
m16   m4us    m2us    6       6                     hardware
m16   m4us    i2us    6       6                     hardware
m16   m4us    128 us  6       6                     hardware
m16   il4us   m2us    10      10       4 mono       macro
m16   il4us   i2us    10      10       4 mono       macro
m16   il4us   128 us  10      10       4 mono       macro
m32   m4us    m2us    14      14                    macro
m32   il4us   m2us    18      18       4 mono       macro
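
The address calculation for the mono load with offset can be sketched as follows. This Python fragment illustrates only the stated rules (a 2-byte offset, treated as unsigned, no more than 65535). The function names are hypothetical, and any wrap-around of the 4-byte address is not modelled.

```python
MAX_MONO_OFFSET = 65535  # largest offset allowed by the notes above


def unsigned_offset(raw_bits: int) -> int:
    """Interpret a 2-byte offset as unsigned, as the notes specify."""
    return raw_bits & 0xFFFF


def effective_address(addr: int, offset: int) -> int:
    """Address accessed by the load: addr + offset, with the offset range checked."""
    if not 0 <= offset <= MAX_MONO_OFFSET:
        raise ValueError("offset must be in the range 0..65535")
    return addr + offset


# A register holding 0xFFFF contributes +65535, not -1.
address = effective_address(0x1000, unsigned_offset(0xFFFF))
```

The practical consequence of the unsigned treatment is that a negative displacement cannot be expressed through the offset operand and has to be folded into addr instead.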


ld (poly, no offset)

Operands: dst, addr

Description:

Loads into the poly registers dst, width(dst) bytes at address addr.

Constraints:
• dst has domain poly.
• dst has maximum width of 32.
• addr has type integer.
• addr has width of 2.

Notes: addr is assumed to be aligned to the width of dst or 16.

Details:
dst   addr    cycles  latency  temporaries  comments
p1    m2us    1       1                     hardware
p1    il2us   2       2                     hardware
p2    m2us    1       1                     hardware
p2    il2us   1       1                     hardware
p4    m2us    1       1                     hardware
p4    il2us   1       1                     hardware
p8    m2us    2       2                     hardware
p8    il2us   2       2                     hardware
p16   m2us    4       4                     hardware
p16   il2us   4       4                     hardware
p32   m2us    8       8                     hardware
p32   il2us   8       8                     hardware
p1    p2us    2       2                     macro
p2    p2us    2       2                     macro
p4    p2us    2       2                     macro
p8    p2us    3       3                     macro
p16   p2us    5       5                     macro
p32   p2us    9       9                     macro


ld (poly with offset)

Operands: dst, addr, offset

Description:

Loads into the poly registers dst, width(dst) bytes from address
(addr+offset).

Constraints:
• dst has domain poly.
• dst has maximum width of 32.
• addr has domain mono or poly.
• addr has type integer.
• addr has width of 2.
• offset has domain immediate.
• offset has width of 4.

Notes: The maximum offset is 127. Macros exist to support higher offsets, but these break
down into an add of the immediate to the mono address parameter, followed by a load
operation using the modified mono address register as the address, followed by a subtract
of the immediate offset from the mono address register. In the table below, entries with an
immediate of 128 mean 128 or greater and represent the macro discussed here.

Details:
dst   addr    offset  cycles  latency  temporaries  comments
p1    m2us    i4      1       1                     hardware
p1    m2us    128     3       3                     macro
p2    m2us    i4      1       1                     hardware
p2    m2us    128     3       3                     macro
p4    m2us    i4      1       1                     hardware
p4    m2us    128     3       3                     macro
p8    m2us    i4      2       2                     hardware
p8    m2us    128     4       4                     macro
p16   m2us    i4      4       4                     hardware
p16   m2us    128     6       6                     macro
p32   m2us    i4      8       8                     hardware
p32   m2us    128     10      10                    macro
p1    p2us    i4      2       2                     macro
p2    p2us    i4      2       2                     macro
p4    p2us    i4      2       2                     macro
p8    p2us    i4      3       3                     macro
p16   p2us    i4      5       5                     macro
p32   p2us    i4      9       9                     macro
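
The large-offset macro described in the notes can be sketched as a three-step expansion. The Python below is only a model: the tuple layout, the function names and the use of "sub" as the subtract mnemonic are assumptions for illustration, and the cycle figures are copied from the mono-address rows of the table above.

```python
MAX_HARDWARE_OFFSET = 127  # largest offset the hardware form accepts

# Cycles for the hardware form with a mono address, by dst width in bytes
# (taken from the i4 rows of the table above).
HARDWARE_CYCLES = {1: 1, 2: 1, 4: 1, 8: 2, 16: 4, 32: 8}


def expand_poly_load(dst: str, addr_reg: str, offset: int) -> list:
    """Return the sequence of steps for a poly load with an immediate offset."""
    if offset < 0:
        raise ValueError("negative offsets are not modelled here")
    if offset <= MAX_HARDWARE_OFFSET:
        return [("ld", dst, addr_reg, offset)]
    return [
        ("add", addr_reg, addr_reg, offset),  # bump the mono address register
        ("ld", dst, addr_reg),                # load from the modified address
        ("sub", addr_reg, addr_reg, offset),  # restore the mono address register
    ]


def load_cycles(width_bytes: int, offset: int) -> int:
    """Cycle count from the table: the macro costs two cycles more than hardware."""
    base = HARDWARE_CYCLES[width_bytes]
    return base if offset <= MAX_HARDWARE_OFFSET else base + 2


steps = expand_poly_load("dst", "addr", 200)
```

The two extra cycles in the macro rows of the table (3 instead of 1, 10 instead of 8, and so on) are consistent with the added add and subtract steps, which is why keeping offsets at or below 127 is cheaper where the data layout allows it.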


fld (poly, no offset)

Operands: dst, addr

Description:

Loads into the poly registers dst, width(dst) bytes at address addr,
regardless of enable state.

Constraints:
• dst has domain poly.
• dst has maximum width of 32.
• addr has type integer.
• addr has width of 2.

Notes: addr is assumed to be aligned to the width of dst or 16.

Details:
dst   addr    cycles  latency  temporaries  comments
p1    m2us    1       1                     hardware
p1    il2u    2       2                     hardware
p2    m2us    1       1                     hardware
p2    il2u    1       1                     hardware
p4    m2us    1       1                     hardware
p4    il2u    1       1                     hardware
p8    m2us    2       2                     hardware
p8    il2u    2       2                     hardware
p16   m2us    4       4                     hardware
p16   il2u    4       4                     hardware
p32   m2us    8       8                     hardware
p32   il2u    8       8                     hardware
p1    p2us    2       2                     macro
p2    p2us    2       2                     macro
p4    p2us    2       2                     macro
p8    p2us    3       3                     macro
p16   p2us    5       5                     macro
p32   p2us    9       9                     macro
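
The difference between ld and fld on poly registers can be pictured with a small Python model. The description above states only that fld loads regardless of enable state. That ld leaves the registers of disabled processing elements unchanged is an assumption of this sketch, and the list-based representation and names are purely illustrative.

```python
def poly_load(registers, memory_values, enabled, forced=False):
    """Model a poly load across processing elements (one list entry per element).

    forced=False models ld (assumed to update only enabled elements);
    forced=True models fld (updates every element, regardless of enable state).
    """
    return [
        new if (forced or is_enabled) else old
        for old, new, is_enabled in zip(registers, memory_values, enabled)
    ]


registers = [0, 0, 0]
memory_values = [1, 2, 3]
enabled = [True, False, True]

after_ld = poly_load(registers, memory_values, enabled)
after_fld = poly_load(registers, memory_values, enabled, forced=True)
```

Under this model the two instructions differ only for disabled elements, which agrees with the identical cycle counts listed for ld and fld.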


fld (poly with offset)

Operands: dst, addr, offset

Description:

Loads into the poly registers dst, width(dst) bytes from address
(addr+offset), regardless of enable state.

Constraints:
• dst has domain poly.
• dst has maximum width of 32.
• addr has domain mono or poly.
• addr has type integer.
• addr has width of 2.
• offset has domain immediate.
• offset has width of 2.

Notes: The maximum offset is 127. Macros exist to support higher offsets, but these break
down into an add of the immediate to the mono address parameter, followed by a forced
load operation using the modified mono address register as the address, followed by a
subtract of the immediate offset from the mono address register. In the table below, entries
with an immediate of 128 mean 128 or greater and represent the macro discussed here.

Details:
dst   addr    offset  cycles  latency  temporaries  comments
p1    m2us    i2      1       1                     hardware
p1    m2us    128     3       3                     macro
p2    m2us    i2      1       1                     hardware
p2    m2us    128     3       3                     macro
p4    m2us    i2      1       1                     hardware
p4    m2us    128     3       3                     macro
p8    m2us    i2      2       2                     hardware
p8    m2us    128     4       4                     macro
p16   m2us    i2      4       4                     hardware
p16   m2us    128     6       6                     macro
p32   m2us    i2      8       8                     hardware
p32   m2us    128     10      10                    macro
p1    p2us    i2      2       2                     macro
p2    p2us    i2      2       2                     macro
p4    p2us    i2      2       2                     macro
p8    p2us    i2      3       3                     macro
p16   p2us    i2      5       5                     macro
p32   p2us    i2      9       9                     macro


st (mono, no offset)

Operands: addr, src

Description:

Stores src at mono address addr.

Constraints:
• addr has type integer.
• addr has width of 4.
• addr has domain mono or immediate or label.
• src has domain mono.
• src has maximum width of 32.

Notes: addr is assumed to be aligned to the width of src or 16.

Details:
addr    src   cycles  latency  temporaries  comments
m4us    m2    6       6                     hardware
m4us    m1    6       6                     hardware
il4us   m2    10      10       4 mono       macro
il4us   m1    10      10       4 mono       macro
m4us    m4    6       6                     hardware
il4us   m4    10      10       4 mono       macro
m4us    m8    6       6                     hardware
il4us   m8    10      10       4 mono       macro
m4us    m16   6       6                     hardware
il4us   m16   10      10       4 mono       macro


st (mono with offset) 


Operands: addr, src, offset 


Description: 


store src at mono address (addrt+offset). 
Constraints: 

e addr has type integer. 

e addr has width of 4. 

e addr has domain mono or immediate or label. 

e src has domain mono. 

e src has maximum width of 32. 

e offset has domain mono or immediate. 

e offset has type integer. 

e offset has width of 2. 


Notes: The address addr + offset is assumed to be aligned to the width of src or 16. Offset 
may be no more than 65535 and is treated as unsigned. 


Details: 

addr src offset cycles latency temporaries | comments 
m4us m1 m2us 6 6 hardware 
m4us m1 i2us 6 6 hardware 
m4us m1 128 us 6 6 hardware 
m4us m2 m2us 6 6 hardware 
m4us m2 i2us 6 6 hardware 
m4us m2 128 us 6 6 hardware 
m4us m4 m2us 6 6 hardware 
m4us m4 i2us 6 6 hardware 
m4us m4 128 us 6 6 hardware 
m4us m8 m2us 6 6 hardware 
m4us m8 i2us 6 6 hardware 
m4us m8 128 us 6 6 hardware 
m4us m16 m2us 6 6 hardware 
m4us m16 i2us 6 6 hardware 
m4us m16 128 us 6 6 hardware 
m4us m32 m2us 14 14 macro


st (poly, no offset) 


Operands: addr, src 


Description: 


Stores src at address addr.


Constraints:

• addr has type integer.
• addr has width of 2.
• src has domain poly.
• src has maximum width of 32.


Notes: addr is assumed to be aligned to the width of src. The poly temporaries supplied for 
the 1 byte variant of this instruction need to be in the low half of the poly register file. 


Details: 
addr src cycles latency temporaries | comments 
m2us p2 1 1 hardware
m2us p1 12 12 2 mono 3 poly macro
il2us p2 1 1 hardware
il2us p1 13 13 2 mono 3 poly macro
m2us p4 1 1 hardware
il2us p4 1 1 hardware
m2us p8 2 2 hardware
il2us p8 2 2 hardware
m2us p16 4 4 hardware
il2us p16 4 4 hardware
m2us p32 8 8 hardware
il2us p32 8 8 hardware
p2us p1 17 18 7 poly macro
p2us p2 2 2 macro
p2us p4 2 2 macro
p2us p8 3 3 macro
p2us p16 5 5 macro
p2us p32 9 9 macro


st (poly with offset) 


Operands: addr, src, offset 


Description: 


Stores src at address (addr+offset).


Constraints:

• addr has type integer.
• addr has width of 2.
• src has domain poly.
• src has maximum width of 32.
• offset has domain immediate.
• offset has type integer.
• offset has width of 2.


Notes: The address addr + offset is assumed to be aligned to the width of src or 16. The
offset is treated as unsigned and may be no more than 65535. The poly temporaries supplied
for the 1 byte variant of this instruction need to be in the low half of the poly register file.
The hardware instruction accepts a maximum offset of 127. Macros exist to support higher
offsets; these break down into an add of the immediate offset to the mono address register,
followed by a store operation using the modified mono address register as the address,
followed by a subtract of the immediate offset from the mono address register. In the table
below, entries with an immediate of 128 mean 128 or greater and represent this macro.
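

The macro expansion for large offsets described in the notes can be sketched as a small Python model. This is an illustration only, not ClearSpeed tooling: the function name expand_poly_store and the tuple encoding of each step are assumptions made for the example; only the 127 limit, the 65535 bound and the add/store/subtract order come from the notes.

```python
HW_MAX_OFFSET = 127    # largest offset the hardware form accepts, per the notes
MAX_OFFSET = 65535     # offset is an unsigned 2-byte immediate


def expand_poly_store(addr_reg, src, offset):
    """Return the steps that a poly store with an immediate offset breaks into.

    Hypothetical model: each step is a tuple naming an operation and its
    operands. It is not real assembler output.
    """
    if not 0 <= offset <= MAX_OFFSET:
        raise ValueError("offset must be in the range 0..65535")
    if offset <= HW_MAX_OFFSET:
        # Small offsets map onto the single hardware store.
        return [("st", addr_reg, src, offset)]
    # Larger offsets: adjust the mono address register, store, then restore it.
    return [
        ("add", addr_reg, addr_reg, offset),
        ("st", addr_reg, src),
        ("sub", addr_reg, addr_reg, offset),
    ]
```

Under this model the macro form always contains two extra steps, which matches the table for widths of 2 bytes and above, where each macro row costs two cycles more than the corresponding hardware row.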


Details: 

addr src offset cycles latency temporaries | comments 

m2us p1 i2us 13 13 2 mono 3 poly macro
m2us p1 128 us 13 13 2 mono 3 poly macro
m2us p2 i2us 1 1 hardware
m2us p2 128 us 3 3 macro
m2us p4 i2us 1 1 hardware
m2us p4 128 us 3 3 macro
m2us p8 i2us 2 2 hardware
m2us p8 128 us 4 4 macro
m2us p16 i2us 4 4 hardware
m2us p16 128 us 6 6 macro
m2us p32 i2us 8 8 hardware
m2us p32 128 us 10 10 macro
p2us p1 i2us 17 18 7 poly macro
p2us p2 i2us 2 2 macro
p2us p4 i2us 2 2 macro
p2us p8 i2us 3 3 macro
p2us p16 i2us 5 5 macro
p2us p32 i2us 9 9 macro


fst (poly, no offset) 


Operands: addr, src 


Description: 


Stores src at address addr, regardless of enable state.


Constraints:

• addr has type integer.
• addr has width of 2.
• src has domain poly.
• src has maximum width of 32.


Notes: addr is assumed to be aligned to the width of src. The poly temporaries supplied for 
the 1 byte variant of this instruction need to be in the low half of the poly register file. 


Details: 
addr src cycles latency temporaries | comments 
m2us p2 1 1 hardware
m2us p1 18 18 2 mono 4 poly macro
il2us p2 1 1 hardware
il2us p1 19 19 2 mono 4 poly macro
m2us p4 1 1 hardware
il2us p4 1 1 hardware
m2us p8 2 2 hardware
il2us p8 2 2 hardware
m2us p16 4 4 hardware
il2us p16 4 4 hardware
m2us p32 8 8 hardware
il2us p32 8 8 hardware
p2us p1 23 24 8 poly macro
p2us p2 2 2 macro
p2us p4 2 2 macro
p2us p8 3 3 macro
p2us p16 5 5 macro
p2us p32 9 9 macro


fst (poly with offset) 


Operands: addr, src, offset 


Description: 


Stores src at address (addr+offset), regardless of enable state.


Constraints:

• addr has type integer.
• addr has width of 2.
• src has domain poly.
• src has maximum width of 32.
• offset has domain immediate.
• offset has type integer.
• offset has width of 2.


Notes: The address addr + offset is assumed to be aligned to the width of src or 16. The
offset is treated as unsigned and may be no more than 65535. The poly temporaries supplied
for the 1 byte variant of this instruction need to be in the low half of the poly register file.
The hardware instruction accepts a maximum offset of 127. Macros exist to support higher
offsets; these break down into an add of the immediate offset to the mono address register,
followed by a forced store operation using the modified mono address register as the
address, followed by a subtract of the immediate offset from the mono address register. In
the table below, entries with an immediate of 128 mean 128 or greater and represent this
macro.


Details: 
addr src offset cycles latency temporaries | comments 
m2us p1 i2us 19 20 2 mono 4 poly macro
m2us p1 128 us 19 20 2 mono 4 poly macro
m2us p2 i2us 1 1 hardware
m2us p2 128 us 3 3 macro
m2us p4 i2us 1 1 hardware
m2us p4 128 us 3 3 macro
m2us p8 i2us 2 2 hardware
m2us p8 128 us 4 4 macro
m2us p16 i2us 4 4 hardware
m2us p16 128 us 6 6 macro
m2us p32 i2us 8 8 hardware
m2us p32 128 us 10 10 macro
p2us p1 i2us 23 24 8 poly macro
p2us p2 i2us 2 2 macro
p2us p4 i2us 2 2 macro
p2us p8 i2us 3 3 macro
p2us p16 i2us 5 5 macro
p2us p32 i2us 9 9 macro


ld.index (poly, no offset) 


Operands: dst, addr 


Description: 


Loads into the poly registers dst, width(dst) bytes at the address 
defined by the poly index register + addr. 


Constraints: 
• dst has domain poly.
• dst has maximum width of 32.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 2.


Notes: The final address is assumed to be aligned to the size of dst or 16. 


Details: 

dst addr cycles latency temporaries | comments 
p1 m2us 1 1 hardware
p1 il2us 2 2 hardware
p2 m2us 1 1 hardware
p2 il2us 1 1 hardware
p4 m2us 1 1 hardware
p4 il2us 1 1 hardware
p8 m2us 2 2 hardware
p8 il2us 2 2 hardware
p16 m2us 4 4 hardware
p16 il2us 4 4 hardware
p32 m2us 8 8 hardware
p32 il2us 8 8 hardware


ld.index (poly with offset) 


Operands: dst, addr, offset 


Description: 


Loads into the poly registers dst, width(dst) bytes from the address 
defined by the poly index register + offset + addr. 


Constraints: 


• dst has domain poly.
• dst has maximum width of 32.
• addr has domain mono.
• addr has width of 2.
• offset has domain immediate.
• offset has width of 2.


Notes: The final address is assumed to be aligned to the size of dst or 16. 
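

The address calculation in the description above can be modelled in a few lines of Python. This is a sketch under stated assumptions rather than a statement of how the hardware is built: it assumes each PE reads from its own local memory and holds its own copy of the index register, it ignores the enable state and alignment, and the name ld_index and its parameters are invented for the example.

```python
def ld_index(pe_memories, index_regs, addr, offset=0, width=1):
    """Model ld.index: each PE reads `width` bytes at index + offset + addr.

    pe_memories: one bytes-like object per PE (assumed PE-local memory).
    index_regs:  one integer per PE (the poly load/store index register).
    addr:        the mono (shared) address operand.
    offset:      the immediate offset operand.
    """
    results = []
    for memory, index in zip(pe_memories, index_regs):
        start = index + offset + addr
        if start < 0 or start + width > len(memory):
            raise IndexError("address outside this PE's memory")
        results.append(bytes(memory[start:start + width]))
    return results
```

Because the index differs per PE while addr and offset are common to all PEs, each PE can read from a different location with a single instruction.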


Details: 
dst addr offset cycles latency temporaries | comments 
p1 m2 i2 1 1 hardware
p2 m2 i2 1 1 hardware
p4 m2 i2 1 1 hardware
p8 m2 i2 2 2 hardware
p16 m2 i2 4 4 hardware
p32 m2 i2 8 8 hardware


fld.index (poly, no offset) 


Operands: dst, addr


Description:


Loads into the poly registers dst, width(dst) bytes at the address
defined by the poly index register + addr, regardless of enable
state.


Constraints:

• dst has domain poly.
• dst has maximum width of 32.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 2.


Notes: The final address is assumed to be aligned to the size of dst or 16. 


Details: 

dst addr cycles latency temporaries | comments 
p1 m2us 1 1 hardware
p1 il2us 2 2 hardware
p2 m2us 1 1 hardware
p2 il2us 1 1 hardware
p4 m2us 1 1 hardware
p4 il2us 1 1 hardware
p8 m2us 2 2 hardware
p8 il2us 2 2 hardware
p16 m2us 4 4 hardware
p16 il2us 4 4 hardware
p32 m2us 8 8 hardware
p32 il2us 8 8 hardware


fld.index (poly with offset) 


Operands: dst, addr, offset 


Description: 


Loads into the poly registers dst, width(dst) bytes from the address 
defined by the poly index register + offset + addr, regardless of 
enable state. 


Constraints: 


• dst has domain poly.
• dst has maximum width of 32.
• addr has domain mono.
• addr has width of 2.
• offset has domain immediate.
• offset has width of 2.


Notes: The final address is assumed to be aligned to the size of dst or 16. 


Details: 
dst addr offset cycles latency temporaries | comments 
p1 m2 i2 1 1 hardware
p2 m2 i2 1 1 hardware
p4 m2 i2 1 1 hardware
p8 m2 i2 2 2 hardware
p16 m2 i2 4 4 hardware
p32 m2 i2 8 8 hardware


st.index (poly, no offset) 


Operands: addr, src 


Description: 


Stores the poly registers src into the address defined by the poly 
index register + addr. 


Constraints: 
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 2.
• src has domain poly.
• src has maximum width of 32.


Notes: The final address is assumed to be aligned to the size of src or 16. The poly 
temporaries supplied for the 1 byte variant of this instruction need to be in the low half of the 
poly register file. 


Details: 

addr src cycles latency temporaries | comments 

m2us p2 1 1 hardware
m2us p1 16 17 7 poly macro
m2us p4 1 1 hardware
m2us p8 2 2 hardware
m2us p16 4 4 hardware
m2us p32 8 8 hardware


st.index (poly with offset) 


Operands: addr, src, offset 


Description: 


Stores the poly registers src into the address defined by the poly 
index register + offset + addr. 


Constraints: 
• addr has domain mono.
• addr has width of 2.
• src has domain poly.
• src has maximum width of 32.
• offset has domain immediate.
• offset has width of 2.


Notes: The final address is assumed to be aligned to the size of src or 16. The poly 
temporaries supplied for the 1 byte variant of this instruction need to be in the low half of the 
poly register file. 


Details: 

addr src offset cycles latency temporaries | comments 

m2 p1 i2 18 19 7 poly macro
m2 p2 i2 1 1 hardware
m2 p4 i2 1 1 hardware
m2 p8 i2 2 2 hardware
m2 p16 i2 4 4 hardware
m2 p32 i2 8 8 hardware


fst.index (poly, no offset) 


Operands: addr, src 


Description: 


Stores the poly registers src into the address defined by the poly 
index register + addr, regardless of enable state. 


Constraints: 
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 2.
• src has domain poly.
• src has maximum width of 32.


Notes: The final address is assumed to be aligned to the size of src or 16. The poly 
temporaries supplied for the 1 byte variant of this instruction need to be in the low half of the 
poly register file. 


Details: 

addr src cycles latency temporaries | comments 

m2us p2 1 1 hardware
m2us p1 22 23 8 poly macro
m2us p4 1 1 hardware
m2us p8 2 2 hardware
m2us p16 4 4 hardware
m2us p32 8 8 hardware


fst.index (poly with offset) 


Operands: addr, src, offset 


Description: 


Stores the poly registers src into the address defined by the poly 
index register + offset + addr, regardless of enable state. 


Constraints: 
• addr has domain mono.
• addr has width of 2.
• src has domain poly.
• src has maximum width of 32.
• offset has domain immediate.
• offset has width of 2.


Notes: The final address is assumed to be aligned to the size of src or 16. The poly 
temporaries supplied for the 1 byte variant of this instruction need to be in the low half of the 
poly register file. 


Details: 

addr src offset cycles latency temporaries | comments 

m2 p1 i2 24 25 8 poly macro
m2 p2 i2 1 1 hardware
m2 p4 i2 1 1 hardware
m2 p8 i2 2 2 hardware
m2 p16 i2 4 4 hardware
m2 p32 i2 8 8 hardware


ls.index.fput


Operands: src 


Description: 


Sets the poly load/store index register. 


Constraints: 
• src has domain poly.
• src has width of 2.


Notes: This will be used in subsequent ld.index or st.index instructions and will
automatically be set in loads or stores where the address is poly.


Details: 
src cycles latency temporaries | comments 
p2u 1 1 microcode 
p2s 1 1 microcode


ls.index.fget


Operands: dst 


Description: 


Gets the poly load/store index register.


Constraints:

• dst has domain poly.
• dst has width of 2.


Notes: This is the index used in ld.index or st.index instructions and will
automatically be set in loads or stores where the address is poly.


Details: 

dst cycles latency temporaries | comments 
p2u 2 microcode 
p2s 2 microcode


ld.direct (mono, no offset)


Operands: dst, addr 


Description: 


Loads into the mono registers dst, width(dst) bytes at address addr, 
bypassing the cache. 


Constraints: 
• dst has domain mono.
• dst has maximum width of 32.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.


Notes: addr is assumed to be aligned to the width of dst or 16. This is useful for performing 
a wait on the global semaphore unit. 


Details: 

dst addr cycles latency temporaries | comments 

m1 m4us 6 6 hardware
m2 m4us 6 6 hardware
m4 m4us 6 6 hardware
m8 m4us 6 6 hardware
m16 m4us 6 6 hardware


ld.direct (mono with offset)


Operands: dst, addr, offset


Description:


Loads into the mono registers dst, width(dst) bytes at address
(addr+offset), bypassing the cache.


Constraints:

• dst has domain mono.
• dst has maximum width of 32.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.
• offset has domain mono or immediate.
• offset has type integer.
• offset has width of 2.


Notes: The address addr + offset is assumed to be aligned to the width of dst or 16. This is 
useful for performing a wait on the global semaphore unit. 


Details: 

dst addr offset cycles latency temporaries | comments 
m1 m4us m2us 6 6 hardware 
m1 m4us i2us 6 6 hardware 
m2 m4us m2us 6 6 hardware 
m2 m4us i2us 6 6 hardware 
m4 m4us m2us 6 6 hardware 
m4 m4us i2us 6 6 hardware 
m8 m4us m2us 6 6 hardware 
m8 m4us i2us 6 6 hardware 
m16 m4us m2us 6 6 hardware 
m16 m4us i2us 6 6 hardware 
m32 m4us m2us 14 14 macro


st.direct (mono, no offset)


Operands: addr, src 


Description: 


Stores src at mono address addr, bypassing the cache.


Constraints:

• addr has type integer.
• addr has width of 4.
• addr has domain mono or immediate or label.
• src has domain mono.
• src has maximum width of 32.


Notes: addr is assumed to be aligned to the width of src or 16. This is useful for 
performing a signal on the global semaphore unit. 


Details: 

addr src cycles latency temporaries | comments 
m4us m2 6 6 hardware 
m4us m1 6 6 hardware 
m4us m4 6 6 hardware 
m4us m8 6 6 hardware 
m4us m16 6 6 hardware


st.direct (mono with offset) 


Operands: addr, src, offset 


Description: 


Stores src at mono address (addr+offset), bypassing the cache.


Constraints:

• addr has type integer.
• addr has width of 4.
• addr has domain mono or immediate or label.
• src has domain mono.
• src has maximum width of 32.
• offset has domain mono or immediate.
• offset has type integer.
• offset has width of 2.


Notes: The address addr + offset is assumed to be aligned to the width of src or 16. Offset 
may be no more than 65535 and is treated as unsigned. This is useful for performing a 
signal on the global semaphore unit. 


Details: 

addr src offset cycles latency temporaries | comments 
m4us m1 m2us 6 6 hardware 
m4us m1 i2us 6 6 hardware 
m4us m2 m2us 6 6 hardware 
m4us m2 i2us 6 6 hardware 
m4us m4 m2us 6 6 hardware 
m4us m4 i2us 6 6 hardware 
m4us m8 m2us 6 6 hardware 
m4us m8 i2us 6 6 hardware 
m4us m16 m2us 6 6 hardware 
m4us m16 i2us 6 6 hardware 
m4us m32 m2us 14 14 macro


ld.yield (mono, no offset) 


Operands: dst, addr


Description:


Loads into the mono registers dst, width(dst) bytes at address addr,
yielding the thread until the result returns.


Constraints:

• dst has domain mono.
• dst has maximum width of 32.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.


Notes: addr is assumed to be aligned to the width of dst or 16.


ld.yield (mono with offset) 


Operands: dst, addr, offset 
Description:


Loads into the mono registers dst, width(dst) bytes at address
(addr+offset), yielding the thread until the result returns.


Constraints:

• dst has domain mono.
• dst has maximum width of 32.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.
• offset has domain mono or immediate.
• offset has type integer.
• offset has width of 2.


Notes: The address addr + offset is assumed to be aligned to the width of dst or 16. 


Details: 

dst addr offset cycles latency temporaries | comments 
m1 m4us m2us 6 6 hardware 
m1 m4us i2us 6 6 hardware 
m2 m4us m2us 6 6 hardware 
m2 m4us i2us 6 6 hardware 
m4 m4us m2us 6 6 hardware 
m4 m4us i2us 6 6 hardware 
m8 m4us m2us 6 6 hardware 
m8 m4us i2us 6 6 hardware 
m16 m4us m2us 6 6 hardware 
m16 m4us i2us 6 6 hardware 
m32 m4us m2us 14 14 macro


st.yield (mono, no offset) 


Operands: addr, src 


Description: 


Stores src at mono address addr, yielding the thread, if the cache
is full, until a slot becomes available.


Constraints:

• addr has type integer.
• addr has width of 4.
• addr has domain mono or immediate or label.
• src has domain mono.
• src has maximum width of 32.


Notes: addr is assumed to be aligned to the width of src or 16. This is useful for 
performing a signal on the global semaphore unit. 


Details: 

addr src cycles latency temporaries | comments 
m4us m2 6 6 hardware
m4us m1 6 6 hardware
m4us m4 6 6 hardware
m4us m8 6 6 hardware
m4us m16 6 6 hardware


st.yield (mono with offset) 


Operands: addr, src, offset 


Description: 


Stores src at mono address (addr+offset), yielding the thread, if
the cache is full, until a slot becomes available.


Constraints:

• addr has type integer.
• addr has width of 4.
• addr has domain mono or immediate or label.
• src has domain mono.
• src has maximum width of 32.
• offset has domain mono or immediate.
• offset has type integer.
• offset has width of 2.


Notes: The address addr + offset is assumed to be aligned to the width of src or 16. Offset 
may be no more than 65535 and is treated as unsigned. This is useful for performing a 
signal on the global semaphore unit. 


Details: 

addr src offset cycles latency temporaries | comments 
m4us m1 m2us 6 6 hardware 
m4us m1 i2us 6 6 hardware 
m4us m2 m2us 6 6 hardware 
m4us m2 i2us 6 6 hardware 
m4us m4 m2us 6 6 hardware 
m4us m4 i2us 6 6 hardware 
m4us m8 m2us 6 6 hardware 
m4us m8 i2us 6 6 hardware 
m4us m16 m2us 6 6 hardware 
m4us m16 i2us 6 6 hardware 
m4us m32 m2us 14 14 macro


ld.direct.yield (mono, no offset) 


Operands: dst, addr 


Description: 


Loads into the mono registers dst, width(dst) bytes at address addr, 
bypassing the cache and yielding the thread until the result 
returns. 


Constraints: 


• dst has domain mono.
• dst has maximum width of 32.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.


Notes: addr is assumed to be aligned to the width of dst or 16. This is useful for performing
a wait on the global semaphore unit.


Details: 

dst addr cycles latency temporaries | comments 
m1 m4us 6 6 hardware 
m2 m4us 6 6 hardware 
m4 m4us 6 6 hardware 
m8 m4us 6 6 hardware 
m16 m4us 6 6 hardware


ld.direct.yield (mono with offset) 


Operands: dst, addr, offset 


Description: 


Loads into the mono registers dst, width(dst) bytes at address 
(addr+offset), bypassing the cache and yielding the thread until the 
result returns. 


Constraints: 


• dst has domain mono.
• dst has maximum width of 32.
• addr has domain mono or immediate or label.
• addr has type integer.
• addr has width of 4.
• offset has domain mono or immediate.
• offset has type integer.
• offset has width of 2.


Notes: The address addr + offset is assumed to be aligned to the width of dst or 16. This is 
useful for performing a wait on the global semaphore unit. 


Details: 

dst addr offset cycles latency temporaries | comments 
m1 m4us m2us 6 6 hardware 
m1 m4us i2us 6 6 hardware 
m2 m4us m2us 6 6 hardware 
m2 m4us i2us 6 6 hardware 
m4 m4us m2us 6 6 hardware 
m4 m4us i2us 6 6 hardware 
m8 m4us m2us 6 6 hardware 
m8 m4us i2us 6 6 hardware 
m16 m4us m2us 6 6 hardware 
m16 m4us i2us 6 6 hardware 
m32 m4us m2us 14 14 macro


ac.ls.sig 


Operands: sem 


Description: 


Signal the semaphore indicated by the value sem after previous poly
loads or stores have completed.


Constraints:

• sem has domain mono or immediate.
• sem has width of 2.


Notes: This is normally used prior to a pio load command to ensure the data has been 
written to poly memory. 


Details: 

sem cycles latency temporaries | comments 
m2 1 1 hardware
i2 1 1 hardware


mono.ls.base.put


Operands: src


Description: 


Sets the mono base register.


Constraints:

• src has domain mono or immediate.
• src has width of 4.


Notes: This sets the top 32 bits of the 64 bit mono address, which will be ORed with the
bottom 32 bits provided by any future mono loads or stores. This instruction should be used
only with great care; the base is likely to be set once for the duration of the program (to
refer to a particular segment).
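

The way the base register combines with a load or store address can be written out as a short Python model. It is a sketch of the arithmetic described in the notes only; the function name effective_mono_address is an assumption made for the example and does not correspond to any SDK call.

```python
MASK32 = 0xFFFFFFFF


def effective_mono_address(base, addr):
    """Form a 64-bit mono address from a 32-bit base and a 32-bit address.

    base supplies the top 32 bits (as set by mono.ls.base.put) and addr the
    bottom 32 bits (as given to a later mono load or store).
    """
    return ((base & MASK32) << 32) | (addr & MASK32)
```

With the base left at zero, the effective address is simply the 32-bit address supplied by the load or store.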


Details: 

src cycles latency temporaries | comments 
m4u 2 2 macro 

i4u 2 2 macro 

m4s 2 2 macro 

i4s 2 2 macro 

m4f 2 2 macro 

i4f 2 2 macro 


Document No. 06-RM-1137 Revision: 3.A 


ClearSpeed Technology plc


3.1.10 Swazzle instructions 


These instructions move data between neighboring poly processing elements. Macro 
swazzle instructions can be used by including swazzle.inc.


swazzle.lowtohigh


Operands: dst, src


Description: 


Copies src to dst on the PE whose PE number is 1 higher.


Constraints:

• All operands have minimum width of 2.
• dst has domain poly.
• dst has equal width to src.
• src has domain poly.


Notes: The lowest numbered PE will swazzle in part of the 8 byte value set in
swazzle.low.in.put. dst and src must have the same alignment within the oct, e.g. 0:p2, 8:p2
or 4:p4, 12:p4, and the value swazzled into swazzle low out will be indexed into the oct, e.g.
0:p4 will be in the low half. For swazzle instructions greater than 8 bytes, the nature of the
swazzle path means that overlapping the src and dst will likely cause serious problems.
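

The data movement of the low-to-high swazzle family can be pictured with a small Python model. This is only an illustration: it treats each PE's src as a single value, ignores operand width, oct alignment and enable state, and uses one low_in value for every PE that has no lower neighbour. The name swazzle_lowtohigh and the step parameter (1, 2 or 4 for the plain, x2 and x4 forms) are assumptions made for the example.

```python
def swazzle_lowtohigh(src, low_in, step=1):
    """Model a low-to-high swazzle across a row of PEs.

    src:    list with one value per PE, indexed by PE number.
    low_in: value fed into the PEs that have no source `step` below them
            (standing in for the value set with swazzle.low.in.put).
    step:   distance moved (1, 2 or 4 for the plain, x2 and x4 variants).
    Returns the dst values, one per PE.
    """
    if step < 1:
        raise ValueError("step must be positive")
    return [src[pe - step] if pe >= step else low_in
            for pe in range(len(src))]
```

For example, with four PEs holding 10, 20, 30 and 40, a step of 1 leaves PE 0 holding the swazzled-in value and PEs 1 to 3 holding 10, 20 and 30.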


Details: 
dst src cycles latency temporaries | comments 
p2 p2 2 2 microcode 
p4 p4 2 2 microcode 
p8 p8 2 2 microcode
p16 p16 3 3 microcode
p32 p32 5 5 extension microcode


swazzle.lowtohighx2


Operands: dst, src 


Description: 


Copies src to dst on the PE whose PE number is 2 higher.


Constraints:

• All operands have minimum width of 2.
• dst has domain poly.
• dst has equal width to src.
• src has domain poly.


Notes: The 2 lowest numbered PEs will swazzle in the 8 byte value set in swazzle.low.in.put.
For swazzle instructions greater than 8 bytes, the nature of the swazzle path means that
overlapping the src and dst will likely cause serious problems.


Details: 
dst src cycles latency temporaries | comments 
p2 p2 3 extension microcode
p4 p4 3 extension microcode
p8 p8 3 extension microcode
p16 p16 5 microcode
p32 p32 9 extension microcode


swazzle.lowtohighx4


Operands: dst, src


Description: 


Copies src to dst on the PE whose PE number is 4 higher.


Constraints:

• All operands have minimum width of 2.
• dst has domain poly.
• dst has equal width to src.
• src has domain poly.


Notes: The 4 lowest numbered PEs will swazzle in the 8 byte value set in swazzle.low.in.put.
For swazzle instructions greater than 8 bytes, the nature of the swazzle path means that
overlapping the src and dst will likely cause serious problems.


Details: 
dst src cycles latency temporaries | comments 
p2 p2 5 5 extension microcode
p4 p4 5 5 extension microcode
p8 p8 5 5 extension microcode
p16 p16 9 9 microcode
p32 p32 17 17 extension microcode


swazzle.hightolow


Operands: dst, src 


Description: 


Copies src to dst on the PE whose PE number is 1 lower.


Constraints:

• All operands have minimum width of 2.
• dst has domain poly.
• dst has equal width to src.
• src has domain poly.


Notes: The highest numbered PE will swazzle in part of the 8 byte value set in
swazzle.high.in.put. dst and src must have the same alignment within the oct, e.g. 0:p2,
8:p2 or 4:p4, 12:p4, and the value swazzled into swazzle low out will be indexed into the
oct, e.g. 0:p4 would be swazzled into the low half. For swazzle instructions greater than 8
bytes, the nature of the swazzle path means that overlapping the src and dst will likely
cause serious problems.


Details: 

dst src cycles latency temporaries | comments 

p2 p2 2 2 microcode
p4 p4 2 2 microcode
p8 p8 2 2 microcode
p16 p16 3 3 microcode
p32 p32 5 5 extension microcode


swazzle.hightolowx2


Operands: dst, src


Description: 


Copies src to dst on the PE whose PE number is 2 lower.


Constraints:

• All operands have minimum width of 2.
• dst has domain poly.
• dst has equal width to src.
• src has domain poly.


Notes: dst and src must have the same alignment within the oct, e.g. 0:p2, 8:p2 or 4:p4,
12:p4.

Details: 

dst src cycles latency temporaries | comments 

p2 p2 3 3 extension microcode
p4 p4 3 3 extension microcode
p8 p8 3 3 extension microcode
p16 p16 5 5 microcode
p32 p32 9 9 extension microcode


swazzle.hightolowx4 


Operands: dst, src 


Description: 


Copies src to dst on the PE whose PE number is 4 lower.


Constraints:

• All operands have minimum width of 2.
• dst has domain poly.
• dst has equal width to src.
• src has domain poly.


Notes: The highest numbered PE will swazzle in part of the 8 byte value set in 
swazzle.high.in.put. For swazzle instructions greater than 8 bytes, the nature of the swazzle 
path means that overlapping the src and dst will likely cause serious problems. 


Details: 
dst src cycles latency temporaries | comments 
p2 p2 5 5 extension microcode
p4 p4 5 5 extension microcode
p8 p8 5 5 extension microcode
p16 p16 9 9 microcode
p32 p32 17 17 extension microcode


swazzle.swap.even.up


Operands: dst, src


Description: 


On even PEs, copies src to dst on the PE whose PE number is 1 higher.
On odd PEs, copies src to dst on the PE whose PE number is 1 lower.


Constraints:

• All operands have minimum width of 2.
• dst has domain poly.
• dst has equal width to src.
• src has domain poly.


Notes: dst and src must have the same alignment within the oct, e.g. 0:p2, 8:p2 or 4:p4,
12:p4. For swazzle instructions greater than 8 bytes, the nature of the swazzle path means
that overlapping the src and dst will likely cause serious problems.
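

The pairwise exchange described above can be modelled as follows. This is a sketch only: it assumes an even number of PEs, treats each PE's src as a single value, and ignores operand width, alignment and enable state. The function name swazzle_swap_even_up is an assumption made for the example.

```python
def swazzle_swap_even_up(src):
    """Model swazzle.swap.even.up: PEs 0/1, 2/3, 4/5, ... exchange values.

    src: list with one value per PE, indexed by PE number.
    Returns the dst values, one per PE.
    """
    if len(src) % 2 != 0:
        raise ValueError("this model assumes an even number of PEs")
    dst = [None] * len(src)
    for pe, value in enumerate(src):
        # Even PEs send up to pe + 1, odd PEs send down to pe - 1.
        target = pe + 1 if pe % 2 == 0 else pe - 1
        dst[target] = value
    return dst
```

Every PE has a partner inside the array in this variant, which is consistent with the notes making no mention of swazzled-in values.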


Details: 
dst src cycles latency temporaries | comments 
p2 p2 2 2 microcode 
p4 p4 2 2 microcode 
p8 p8 2 2 microcode
p16 p16 3 3 extension microcode
p32 p32 5 5 extension microcode


swazzle.swap.odd.up 


Operands: dst, src 


Description: 


On odd PEs, copies src to dst on the PE whose PE number is 1 higher.
On even PEs, copies src to dst on the PE whose PE number is 1 lower.


Constraints:

• All operands have minimum width of 2.
• dst has domain poly.
• dst has equal width to src.
• src has domain poly.


Notes: The lowest and highest numbered PEs will swazzle in the 8 byte values set in 
swazzle.low.in.put and swazzle.high.in.put. dst and src must have the same alignment 
within the oct, e.g. 0:p2, 8:p2 or 4:p4, 12:p4, and the value swazzled into the swazzle low 
out and high out will be indexed into the oct e.g. 0:p4 would be swazzled into the low half. 
For swazzle instructions greater than 8 bytes, the nature of the swazzle path means that 
overlapping the src and dst will likely cause serious problems. 


Details: 

dst src cycles latency temporaries | comments 

p2 p2 2 2 microcode 

p4 p4 2 2 microcode 

p8 p8 2 2 microcode 

p16 p16 3 3 extension 
microcode 

SZ p32 5 5 extension 
microcode 
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CSX600 Instruction Set 


swazzle.circular.lowtohigh 


Operands: dst, src 


Description: 


Copies src to dst on PE whose PE number is 1 higher or to PE 0 from 
the highest numbered PE. 


Constraints:
• All operands have a minimum width of 2.
• dst has domain poly.
• dst has equal width to src.
• src has domain poly.


Notes: dst and src must have the same alignment within the oct, e.g. 0:p2, 8:p2 or 4:p4,
12:p4. For swazzle instructions greater than 8 bytes, the nature of the swazzle path means
that overlapping the src and dst will likely cause serious problems.


Details:
dst   src   cycles   latency   temporaries   comments
p2    p2    2        2                       microcode
p4    p4    2        2                       microcode
p8    p8    2        2                       microcode
p16   p16   3        3                       extension microcode
p32   p32   5        5                       extension microcode
swazzle.circular.lowtohighx2 


Operands: dst, src 


Description: 


Copies src to dst on the PE whose PE number is 2 higher, wrapping
around at the top: PE 1 receives from the highest numbered PE and
PE 0 from the second-highest numbered PE.


Constraints:
• All operands have a minimum width of 2.
• dst has domain poly.
• dst has equal width to src.
• src has domain poly.


Notes: dst and src must have the same alignment within the oct, e.g. 0:p2, 8:p2 or 4:p4,
12:p4. For swazzle instructions greater than 8 bytes, the nature of the swazzle path means
that overlapping the src and dst will likely cause serious problems.


Details:
dst   src   cycles   latency   temporaries   comments
p2    p2    3        3                       extension microcode
p4    p4    3        3                       extension microcode
p8    p8    3        3                       extension microcode
p16   p16   5        5                       extension microcode
p32   p32   9        9                       extension microcode
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swazzle.circular.lowtohighx4 
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Operands: dst, src 


Description: 


Copies src to dst on the PE whose PE number is 4 higher, wrapping
around at the top: PE 3 receives from the highest numbered PE, PE 2
from the second-highest numbered PE, and so on for 4 PEs.


Constraints:
• All operands have a minimum width of 2.
• dst has domain poly.
• dst has equal width to src.
• src has domain poly.


Notes: dst and src must have the same alignment within the oct, e.g. 0:p2, 8:p2 or 4:p4,
12:p4. For swazzle instructions greater than 8 bytes, the nature of the swazzle path means
that overlapping the src and dst will likely cause serious problems.


Details:
dst   src   cycles   latency   temporaries   comments
p2    p2    5        5                       extension microcode
p4    p4    5        5                       extension microcode
p8    p8    5        5                       extension microcode
p16   p16   9        9                       extension microcode
p32   p32   17       17                      extension microcode
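
The cycle counts tabulated for swazzle.circular.lowtohigh, lowtohighx2 and lowtohighx4 all fit one simple pattern: one cycle, plus one cycle per PE step for every 8 bytes (or part thereof) of operand width. The Python sketch below records that observation. It is only a fit to the tabulated values; the function name and the formula are assumptions and are not taken from the hardware documentation.

```python
import math


def swazzle_cycles(width_bytes, step):
    """Estimated cycles for a circular swazzle of `step` PEs (1, 2 or 4).

    Empirical fit to the Details tables: 1 + step * ceil(width / 8).
    """
    return 1 + step * math.ceil(width_bytes / 8)


if __name__ == "__main__":
    for step in (1, 2, 4):
        row = [swazzle_cycles(w, step) for w in (2, 4, 8, 16, 32)]
        print(f"x{step}: {row}")
```

Read this way, operands of up to 8 bytes cost the same for a given step, which is consistent with the notes treating transfers greater than 8 bytes as a special case.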
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swazzle.circular.hightolow 


Operands: dst, src 


Description: 


Copies src to dst on PE whose PE number is 1 lower or to the highest 
numbered PE from PE 0. 


Constraints:
• All operands have a minimum width of 2.
• dst has domain poly.
• dst has equal width to src.
• src has domain poly.


Notes: dst and src must have the same alignment within the oct, e.g. 0:p2, 8:p2 or 4:p4,
12:p4. For swazzle instructions greater than 8 bytes, the nature of the swazzle path means
that overlapping the src and dst will likely cause serious problems.


Details:
dst   src   cycles   latency   temporaries   comments
p2    p2    2        2                       microcode
p4    p4    2        2                       microcode
p8    p8    2        2                       microcode
p16   p16   3        3                       extension microcode
p32   p32   5        5                       extension microcode
swazzle.circular.hightolowx2 
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Operands: dst, src 


Description: 


Copies src to dst on the PE whose PE number is 2 lower, wrapping
around at the bottom: the highest numbered PE receives from PE 1 and
the second-highest numbered PE from PE 0.


Constraints:
• All operands have a minimum width of 2.
• dst has domain poly.
• dst has equal width to src.
• src has domain poly.


Notes: dst and src must have the same alignment within the oct, e.g. 0:p2, 8:p2 or 4:p4,
12:p4. For swazzle instructions greater than 8 bytes, the nature of the swazzle path means
that overlapping the src and dst will likely cause serious problems.


Details:
dst   src   cycles   latency   temporaries   comments
p2    p2    3        3                       extension microcode
p4    p4    3        3                       extension microcode
p8    p8    3        3                       extension microcode
p16   p16   5        5                       extension microcode
p32   p32   9        9                       extension microcode
swazzle.circular.hightolowx4 


Operands: dst, src 


Description: 


Copies src to dst on the PE whose PE number is 4 lower, wrapping
around at the bottom: the highest numbered PE receives from PE 3, the
second-highest numbered PE from PE 2, and so on.


Constraints:
• All operands have a minimum width of 2.
• dst has domain poly.
• dst has equal width to src.
• src has domain poly.


Notes: dst and src must have the same alignment within the oct, e.g. 0:p2, 8:p2 or 4:p4,
12:p4. For swazzle instructions greater than 8 bytes, the nature of the swazzle path means
that overlapping the src and dst will likely cause serious problems.


Details:
dst   src   cycles   latency   temporaries   comments
p2    p2    5        5                       extension microcode
p4    p4    5        5                       extension microcode
p8    p8    5        5                       extension microcode
p16   p16   9        9                       extension microcode
p32   p32   17       17                      extension microcode
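
Taken together, the lowtohigh and hightolow variants of swazzle.circular behave like a rotation of the PE array by 1, 2 or 4 positions in either direction. A minimal Python sketch of that reading follows; the function name and the signed step convention (positive for lowtohigh, negative for hightolow) are assumptions made for illustration.

```python
def swazzle_circular(src, step):
    """Rotate values across PEs: the value on PE i lands on PE (i + step) mod N.

    step = +1, +2, +4 models lowtohigh, lowtohighx2, lowtohighx4;
    step = -1, -2, -4 models hightolow, hightolowx2, hightolowx4.
    """
    n = len(src)
    dst = [None] * n
    for pe, value in enumerate(src):
        dst[(pe + step) % n] = value
    return dst


if __name__ == "__main__":
    pes = list(range(8))               # each PE holds its own PE number
    print(swazzle_circular(pes, 2))    # PE 0 and PE 1 receive from the top two PEs
    print(swazzle_circular(pes, -4))   # the top four PEs receive from PE 0 to PE 3
```

Because each PE holds its own number in the usage example, the output can be read directly as "which PE did this value come from".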
swazzle.circular.swap.odd.up 


Operands: dst, src 


Description: 


On odd PEs, copies src to dst on the PE whose PE number is 1 higher, or
to PE 0 from the highest numbered PE. On even PEs, copies src to dst on
the PE whose PE number is 1 lower, or to the highest numbered PE from
PE 0.


Constraints:
• All operands have a minimum width of 2.
• dst has domain poly.
• dst has equal width to src.
• src has domain poly.


Notes: dst and src must have the same alignment within the oct, e.g. 0:p2, 8:p2 or 4:p4,
12:p4. For swazzle instructions greater than 8 bytes, the nature of the swazzle path means
that overlapping the src and dst will likely cause serious problems.


Details:
dst   src   cycles   latency   temporaries   comments
p2    p2    2        2                       microcode
p4    p4    2        2                       microcode
p8    p8    2        2                       microcode
p16   p16   3        3                       extension microcode
p32   p32   5        5                       extension microcode
swazzle.high.in.put 


Operands: src 


Description: 


Sets the 8 byte value which will be swazzled into the highest numbered
PE on a swazzle.hightolow or swazzle.swap.odd.up to be src.


Constraints:
• src has domain mono or immediate.
• src has a minimum width of 2.


Details:
src   cycles   latency   temporaries   comments
m2    4        4                       macro
i2    4        4                       macro
m4    4        4                       macro
i4    4        4                       macro
m8    4        4                       macro
swazzle.low.in.put 


Operands: src 


Description: 


Sets the 8 byte value which will be swazzled into the lowest 
numbered PE on a swazzle.lowtohigh or swazzle.swap.odd.up to be src. 


Constraints:
• src has domain mono or immediate.
• src has a minimum width of 2.


Details:
src   cycles   latency   temporaries   comments
m2    4        4                       macro
i2    4        4                       macro
m4    4        4                       macro
i4    4        4                       macro
m8    4        4                       macro
swazzle.high.out.get 


Operands: dst 


Description: 


Gets the value which was swazzled from the highest numbered PE on a
swazzle.lowtohigh or swazzle.swap.odd.up.


Constraints:
• dst has domain mono.
• dst has a minimum width of 2.


Notes: On real hardware there will be an additional delay of 10-20 cycles, as the
information filters through the hardware pipeline, before the result is available to the user.


Details:
dst   cycles   latency   temporaries   comments
m2    3        3                       macro
m4    4        4                       macro
m8    8        8                       macro
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swazzle.low.out.get 
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Operands: dst 


Description: 


Gets the value which was swazzled from the lowest numbered PE on a
swazzle.hightolow or swazzle.swap.odd.up.


Constraints:
• dst has domain mono.
• dst has a minimum width of 2.


Notes: On real hardware there will be an additional delay of 10-20 cycles, as the
information filters through the hardware pipeline, before the result is available to the user.


Details:
dst   cycles   latency   temporaries   comments
m2    3        3                       macro
m4    4        4                       macro
m8    8        8                       macro
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3.1.11 Input and Output instructions 


These instructions perform I/O operations between mono memory and poly PEs. They are
likely to be changed in future releases. Note that all of the I/O commands take an optional
controller index; this is only needed if the system has more than one PIO or SIO node. The
index starts from 0 for both PIO and SIO, and the default is always zero. Macro I/O
instructions can be used by including io.inc.
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pio.atomic.start 


Operands: controller 


Description: 


Start of an atomic PIO section.


Constraints:
• controller has domain immediate.
• controller has width of 4.


Notes: This should be used at the start of a sequence of PIO instructions. It allows the
scheduler to sequence the instructions together more efficiently. During an atomic section
no other thread can interrupt (it uses the mutex instruction). The end of the sequence should
be denoted by a pio.atomic.end. Note that no more than one atomic section can be nested.
Atomic sections should be fairly short: fewer than 7 instructions. The optional controller
argument defaults to 0 (the only PIO controller).
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pio.atomic.yield.start 


Operands: controller 


Description: 


Start of an atomic PIO section.


Constraints:
• controller has domain immediate.
• controller has width of 4.


Notes: This should be used at the start of a sequence of PIO instructions. It allows the
scheduler to sequence the instructions together more efficiently. During an atomic section
no other thread can interrupt (it uses the mutex instruction). The end of the sequence should
be denoted by a pio.atomic.end. Note that no more than one atomic section can be nested.
Atomic sections should be fairly short: fewer than 7 instructions. If the sequence of
instructions cannot be issued together, then it will yield. The optional controller argument
defaults to 0 (the only PIO controller).
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pio.atomic.end 


Operands: 


Description: 


End of a PIO atomic section. 


Notes: This should be used to denote the end of an atomic section.
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pio.address.data.put 


Operands: controller, addr, size 


Description: 


Loads PIO node with 'size' bytes of 'addr', which includes address,
enables and data, on the PIO controller, 'controller'.


Constraints:
• controller has domain immediate.
• controller has width of 2.
• addr has domain mono.
• addr has width of 2.
• size has domain immediate.
• size has width of 4.


Notes: The first 4 bytes at address 'addr' should contain the address in mono memory
where the data is to be written to or read from. This address should be quad aligned (the
bottom 2 bits are ignored). The second 8 bytes of 'addr' should contain the byte enables. The
next 'size' bytes of 'addr' should contain the data to be loaded. 'size' must be 8, 16, 32 or
64 bytes. The page loaded will normally be used for a subsequent read or write operation.
'size' should be greater than or equal to the number of bytes which will be used in that
operation. It is important to ensure that any previous poly stores to the memory location
have completed. This can be done using ac.ls.sig (on any semaphore), followed by a
sem.sync. The optional controller argument defaults to 0 (the only PIO controller).


Details:
controller   addr   size   cycles   latency   temporaries   comments
i2           m2     i4     1        1                       hardware
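
The memory block that pio.address.data.put reads from 'addr' can be pictured with a small Python sketch. It assumes one reading of the notes above: a 4 byte address at offset 0, an 8 byte enables field immediately after it, then 'size' bytes of data. The little-endian byte order, the function name and the field packing are all assumptions made for illustration and should be checked against the hardware documentation.

```python
import struct

VALID_SIZES = (8, 16, 32, 64)


def build_pio_block(mono_address, byte_enables, data):
    """Pack address, byte enables and data in the order the notes describe."""
    if len(data) not in VALID_SIZES:
        raise ValueError("size must be 8, 16, 32 or 64 bytes")
    aligned = mono_address & ~0x3  # bottom 2 bits are ignored (quad aligned)
    # "<IQ": little-endian (assumed), 4 byte address then 8 byte enables, no padding
    header = struct.pack("<IQ", aligned, byte_enables)
    return header + bytes(data)


if __name__ == "__main__":
    block = build_pio_block(0x1003, 0xFF, bytes(range(8)))
    print(len(block), block.hex())
```

The example passes a deliberately misaligned address to show the bottom two bits being dropped.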
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pio.address.put 


Operands: controller, addr 


Description: 


Loads PIO node for PIO controller, 'controller', with 8 bytes from
'addr', which includes address and enables.


Constraints:
• controller has domain immediate.
• controller has width of 2.
• addr has domain mono.
• addr has width of 2.


Notes: The first 4 bytes at address 'addr' should contain the address in mono memory
where the data is to be written to or read from. This address should be quad aligned (the
bottom 2 bits are ignored). The second 8 bytes of 'addr' should contain the byte enables.
The operation of the byte enables is dependent on the PIO mode. The page loaded will
normally be used in conjunction with pio.data.put for a subsequent write operation. It is
important to ensure that any previous poly stores to the memory location have completed.
This can be done using ac.ls.sig (on any semaphore), followed by a sem.sync. The optional
controller argument defaults to 0 (the only PIO controller).


Details:
controller   addr   cycles   latency   temporaries   comments
i2           m2     1        1                       hardware
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pio.data.put 


Operands: controller, addr, size 


Description: 


Puts data into PIO node for PIO controller, 'controller'.


Constraints:
• controller has domain immediate.
• controller has width of 2.
• addr has domain mono.
• addr has width of 2.
• size has domain immediate.
• size has width of 4.


Notes: The 'size' bytes of 'addr' should contain the data to be loaded into the node. 'size'
must be 8, 16, 32 or 64 bytes. This data will normally be used in conjunction with
pio.address.put for a subsequent read or write operation. 'size' should be greater than or
equal to the number of bytes which will be used in that operation. It is important to
ensure that any previous poly stores to the memory location have completed. This can be
done using ac.ls.sig (on any semaphore), followed by a sem.sync. The optional controller
argument defaults to 0 (the only PIO controller).


Details:
controller   addr   size   cycles   latency   temporaries   comments
i2           m2     i4     1        1                       hardware
pio.data.put.sig 


Operands: controller, addr, size 


Description: 


Loads PIO node for PIO controller, 'controller', with data. 


Constraints:
• controller has domain immediate.
• controller has width of 2.
• addr has domain mono.
• addr has width of 2.
• size has domain immediate.
• size has width of 4.


Notes: The 'size' bytes at address 'addr' should contain the data to be loaded. 'size'
must be 8, 16, 32 or 64 bytes. The page loaded will normally be used in conjunction
with pio.address.put for a subsequent read or write operation. 'size' should be
greater than or equal to the number of bytes which will be used in that operation. It is
important to ensure that any previous poly stores to the memory location have completed.
This can be done using ac.ls.sig (on any semaphore), followed by a sem.sync. Signals the
semaphore set up in pio.putget.semaphore.put when the copy has completed. The optional
controller argument defaults to 0 (the only PIO controller).


Details:
controller   addr   size   cycles   latency   temporaries   comments
i2           m2     i4     1        1                       hardware
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pio.data.get.sig 


Operands: controller, addr, size 


Description: 


Stores 'size' bytes from the PIO node on controller, 'controller',
into 'addr'. This is data only.


Constraints:
• controller has domain immediate.
• controller has width of 2.
• addr has domain mono.
• addr has width of 2.
• size has domain immediate.
• size has width of 4.


Notes: This is usually called immediately after a PIO read request to transfer data from the
PIO node to memory. The 'size' field must be 8, 16, 32 or 64 bytes and should be greater than
or equal to the previous read. Signals the semaphore set up in pio.putget.semaphore.put when
the copy has completed. The optional controller argument defaults to 0 (the only PIO
controller).


Details:
controller addr size cycles latency temporaries comments
i2 m2 i4 1 hardware
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pio.data.get 


Operands: controller, addr, size 


Description: 


Stores 'size' bytes from the PIO node on controller, 'controller',
into 'addr'. This is data only.


Constraints:
• controller has domain immediate.
• controller has width of 2.
• addr has domain mono.
• addr has width of 2.
• size has domain immediate.
• size has width of 4.


Notes: This is usually called immediately after a PIO read request to transfer data from the
PIO node to memory. 'size' must be 8, 16, 32 or 64 bytes and should be greater than or
equal to the previous read. The optional controller argument defaults to 0 (the only
PIO controller).


Details:
controller addr size cycles latency temporaries comments
i2 m2 i4 1 hardware
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pio.sig 


Operands: controller, sem 


Description: 


Signals the semaphore indicated by the value sem after previous pioc
or pioe operations have completed for 'controller'.


Constraints:
• controller has domain immediate.
• controller has width of 2.
• sem has domain mono or immediate.
• sem has width of 2.


Notes: This is normally used after a pioc.reg.get to ensure the data has been written to the
result register. The optional controller argument defaults to 0 (the only PIO controller).


Details:
controller   sem   cycles   latency   temporaries   comments
i2           m2    1        1                       hardware
i2           i2    1        1                       hardware
pio.address.offset.put 


Operands: controller, src 


Description: 


Sets the offset address on PIO controller, 'controller'.


Constraints:
• controller has domain immediate.
• controller has width of 2.
• src has domain mono or immediate or label.
• src has width of 4.


Notes: The address offset is a 4 byte value which will be added to the address used in the
next strided or addressed read or write for each processor. After all the processors have
completed the next transfer, the offset will be reset to zero. This allows a single
address/mask to be set up for multiple transfers, by modifying this value. The optional
controller argument defaults to 0 (the only PIO controller).


Details:
controller   src   cycles   latency   temporaries   comments
i2           m4                                     macro
i2           i4                                     macro
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pio.address.offset.put 


Operands: controller, src 


Description: 


Sets the low 2 bytes of the offset address on PIO controller,
'controller'.


Constraints:
• controller has domain immediate.
• controller has width of 2.
• src has domain mono or immediate or label.
• src has width of 2.


Notes: The address offset is a 2 byte value which will be added to the address used in the
next strided or addressed read or write for each processor. After all the processors have
completed the next transfer, the offset will be reset to zero. This allows a single
address/mask to be set up for multiple transfers, by modifying this value. The optional
controller argument defaults to 0 (the only PIO controller).


Details:
controller   src   cycles   latency   temporaries   comments
i2           m2    1        1                       hardware
i2           i2    2        2                       hardware

Document No. 06-RM-1137 Revision: 3.A 
ClearSpeed Technology plc 




pio.strided.base.put 


Operands: controller, src 


Description: 


Sets the strided base address on PIO controller, 'controller'.


Constraints:
• controller has domain immediate.
• controller has width of 2.
• src has domain mono or immediate or label.
• src has width of 4.


Notes: The strided base is the start address for a strided read or write. The optional
controller argument defaults to 0 (the only PIO controller).


Details:
controller   src   cycles   latency   temporaries   comments
i2           m4                                     macro
i2           i4                                     macro
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pio.strided.base.update 


Operands: controller 


Description: 


Updates the strided base to be the last strided address + strided
'size' for the PIO controller, 'controller'.


Constraints:
• controller has domain immediate.
• controller has width of 2.


Notes: This allows contiguous strided reads or writes and works for both strided read and
strided write. The optional controller argument defaults to 0 (the only PIO controller).


Details:
controller   cycles   latency   temporaries   comments
i2           1        1                       hardware
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pio.strided.base.increment.put 


Operands: controller, src 


Description: 


Sets the strided base to be the current base + 'src' on PIO
controller, 'controller'.


Constraints:
• controller has domain immediate.
• controller has width of 2.
• src has domain mono or immediate or label.
• src has width of 2.


Notes: The optional controller argument defaults to 0 (the only PIO controller).


Details:
controller   src   cycles   latency   temporaries   comments
i2           m2    1        1                       hardware
i2           i2    2        2                       hardware
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pio.strided.size.put 


Operands: controller, src 


Description: 


Sets the strided size on PIO controller, 'controller'.


Constraints:
• controller has domain immediate.
• controller has width of 2.
• src has domain mono or immediate or label.
• src has width of 2.


Notes: The strided size is the number of bytes by which the strided address will be
incremented after a poly PE has participated in the operation. The next PE will use the
incremented address. The optional controller argument defaults to 0 (the only PIO
controller).


Details:
controller src cycles latency temporaries comments
i2 m2 1 hardware
i2 i2 2 hardware
pio.putget.semaphore.put
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Operands: controller, semaphore 


Description: 


Sets the semaphore which will be signalled when a signalling PIO put
has completed on PIO controller, 'controller'.


Constraints:
• controller has domain immediate.
• controller has width of 2.
• semaphore has domain mono or immediate.
• semaphore has width of 2.


Notes: The semaphore will be signalled on completion after a pio.data.get.sig. The optional
controller argument defaults to 0 (the only PIO controller).


Details:
controller   semaphore   cycles   latency   temporaries   comments
i2           m2                                           hardware
i2           i2                                           hardware
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pio.transfer.semaphore.put 


Operands: controller, semaphore 


Description: 


Sets the semaphore which will be signalled after a PIO transfer
operation has completed.


Constraints:
• controller has domain immediate.
• controller has width of 2.
• semaphore has domain mono or immediate.
• semaphore has width of 2.


Notes: The semaphore will be signalled on completion after a PIO transfer operation which
has indicated a signal on completion (operations with a .flush suffix, e.g.
pio.addressed.write.flush). The optional controller argument defaults to 0 (the only PIO
controller).


Details:
controller   semaphore   cycles   latency   temporaries   comments
i2           m2                                           hardware
i2           i2                                           hardware
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pio.addressed.write 
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Operands: controller, size 


Description: 


Performs a write on PIO controller, 'controller', using address and 
data already setup by a PIO address/data put. Writes up to 'size' 
bytes. 


Constraints:
• controller has domain immediate.
• controller has width of 2.
• size has domain immediate.
• size has width of 2.


Notes: The byte enables set up in the PIO address put affect which bytes get written from
the data section. Each bit in the byte enables corresponds with one byte in the data section,
where the lsb of the enables refers to the first byte of the data. Only bytes that have their bit
set will perform a write, and only PEs which have at least one bit set will take part in the
write. Note this will default to using the first PIO controller. The 'size' field must be an
exact multiple of 8, up to 64, although the byte enables can be used to limit the number of
bytes written. The address is offset by the offset address set up by pio.address.offset.put.
The address used is always quad aligned. The optional controller argument defaults to 0
(the only PIO controller).


Details:
controller   size   cycles   latency   temporaries   comments
i2           i2     1        1                       hardware
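
The effect of the byte enables on a single PE's write can be sketched in Python as follows. This is an illustrative model only: the function name, the bytearray standing in for mono memory and the return value are assumptions, and the address offset is not modelled.

```python
def addressed_write(memory, address, data, byte_enables):
    """Write only the bytes of `data` whose enable bit is set.

    Bit 0 (the lsb) of byte_enables controls data[0], bit 1 controls data[1],
    and so on. Returns True if the PE took part (at least one byte written).
    """
    if len(data) % 8 != 0 or not 8 <= len(data) <= 64:
        raise ValueError("size must be a multiple of 8, up to 64")
    base = address & ~0x3  # the address used is always quad aligned
    took_part = False
    for i, byte in enumerate(data):
        if (byte_enables >> i) & 1:
            memory[base + i] = byte
            took_part = True
    return took_part


if __name__ == "__main__":
    mono = bytearray(16)
    addressed_write(mono, 4, bytes([1, 2, 3, 4, 5, 6, 7, 8]), 0b00000101)
    print(mono.hex())  # only the first and third data bytes are stored
```

An enables value of zero leaves memory untouched, which corresponds to a PE that does not take part in the write.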
pio.addressed.write.flush


Operands: controller, size 


Description: 


Performs a write on PIO controller, 'controller', using address and
data already set up by a PIO address/data put. Writes up to 'size'
bytes and signals the semaphore set in the
pio.transfer.semaphore.put when complete.


Constraints:
• controller has domain immediate.
• controller has width of 2.
• size has domain immediate.
• size has width of 2.


Notes: The byte enables set up in the PIO address put affect which bytes get written from
the data section. Each bit in the byte enables corresponds with one byte in the data section,
where the lsb of the enables refers to the first byte of the data. Only bytes that have their bit
set will perform a write, and only PEs which have at least one bit set will take part in the
write. Note this will default to using the first PIO controller. The 'size' field must be an
exact multiple of 8, up to 64, although the byte enables can be used to limit the number of
bytes written. The address is offset by the offset address set up by pio.address.offset.put.
The address used is always quad aligned. The optional controller argument defaults to 0
(the only PIO controller).


Details:
controller   size   cycles   latency   temporaries   comments
i2           i2     1        1                       hardware
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pio.addressed.write.con 
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Operands: controller, size 


Description: 


Performs a consolidated write on PIO controller, 'controller', using
address and data already set up by a PIO address/data put. Writes up
to 'size' bytes. The PIO engine attempts to consolidate the
requested addresses so the minimum data is transferred across the
bus. Multiple PEs writing to the same address range get their values
ORed.


Constraints:
• controller has domain immediate.
• controller has width of 2.
• size has domain immediate.
• size has width of 2.


Notes: The byte enables set up in the PIO address put affect which bytes get written from
the data section. Each bit in the byte enables corresponds with one byte in the data section,
where the lsb of the enables refers to the first byte of the data. Only bytes that have their bit
set will perform a write, and only PEs which have at least one bit set will take part in the
write. Note this will default to using the first PIO controller. Bytes written to the same
address on different PEs will have the results ORed. 'size' must be 8, 16, 32 or 64 bytes.
The address is offset by the offset address set up by pio.address.offset.put. The mono
destination address used should be aligned to the size of the write, i.e. for a 32-byte write,
a 32-byte aligned mono address should be used. The optional controller argument defaults
to 0 (the only PIO controller).


Details:
controller   size   cycles   latency   temporaries   comments
i2           i2     1        1                       hardware
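
The ORing behaviour of a consolidated write can be illustrated with a short Python model. It is a sketch under assumptions: the function name and the per-PE request tuples are invented for the example, and the model ORs together only the values supplied by the PEs (it does not OR with the previous memory contents, which the text above does not specify).

```python
def consolidated_write(memory, requests):
    """Apply one consolidated write.

    `requests` holds one (address, data, byte_enables) tuple per PE.
    Enabled bytes aimed at the same address by different PEs are ORed.
    """
    pending = {}  # target address -> ORed byte value from all participating PEs
    for address, data, byte_enables in requests:
        for i, byte in enumerate(data):
            if (byte_enables >> i) & 1:
                target = address + i
                pending[target] = pending.get(target, 0) | byte
    for target, byte in pending.items():
        memory[target] = byte


if __name__ == "__main__":
    mono = bytearray(8)
    consolidated_write(mono, [
        (0, bytes([0x01] * 8), 0xFF),  # first PE writes all 8 bytes
        (0, bytes([0x10] * 8), 0x0F),  # second PE enables only the low 4 bytes
    ])
    print(mono.hex())
```

Bytes that only one PE enables keep that PE's value, while bytes enabled by both carry the OR of the two values.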
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pio.addressed.write.con.flush 


Operands: controller, size 


Description: 


Performs a consolidated write on PIO controller, 'controller', using
address and data already set up by a PIO address/data put. Writes up
to 'size' bytes and signals the semaphore set in the
pio.transfer.semaphore.put when complete.


Constraints:
• controller has domain immediate.
• controller has width of 2.
• size has domain immediate.
• size has width of 2.


Notes: The byte enables set up in the PIO address put affect which bytes get written from
the data section. Each bit in the byte enables corresponds with one byte in the data section,
where the lsb of the enables refers to the first byte of the data. Only bytes that have their bit
set will perform a write, and only PEs which have at least one bit set will take part in the
write. Note this will default to using the first PIO controller. Bytes written to the same
address on different PEs will have the results ORed. 'size' must be 8, 16, 32 or 64 bytes.
The address is offset by the offset address set up by pio.address.offset.put. The address
used is always quad aligned. The optional controller argument defaults to 0 (the only PIO
controller).


Details:
controller   size   cycles   latency   temporaries   comments
i2           i2     1        1                       hardware
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pio.addressed.read 
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Operands: controller, size 


Description: 


Performs a read of 'size' bytes on PIO controller, 'controller',
using information stored by a PIO address/data put.


Constraints:
• controller has domain immediate.
• controller has width of 2.
• size has domain immediate.
• size has width of 2.


Notes: The byte enables set up in the PIO address put simply act as an enable switch. If the
4 byte value set up for the byte enables is non-zero then the PE takes part in the read; if the
value is zero then the PE is disabled for the read. The 'size' field must be an exact multiple
of 8, up to 64. The optional controller argument defaults to 0 (the only PIO controller).


Details:
controller   size   cycles   latency   temporaries   comments
i2           i2     1        1                       hardware
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pio.addressed.read.con 


Operands: controller, size 


Description: 


Performs a consolidated read of 'size' bytes on PIO controller,
'controller', using information stored by a PIO address put made with
pio.address.data.put. The PIO engine attempts to consolidate the
requested addresses so the minimum data is transferred across the
bus.


Constraints:
• controller has domain immediate.
• controller has width of 2.
• size has domain immediate.
• size has width of 2.


Notes: The byte enables set up by a PIO address put simply act as an enable switch. If the
value set up for the byte enables is non-zero then the PE takes part in the read; if the value
is zero then the PE is disabled for the read. Note this will default to using the first PIO
controller. 'size' must be 8, 16, 32 or 64 bytes. This type of read is more efficient when
many PEs are reading from the same address. The address is offset by the offset
address set up by pio.address.offset.put. The mono source address used should be aligned
to the size of the read, i.e. for a 32-byte read, a 32-byte aligned mono address should be
used. The optional controller argument defaults to 0 (the only PIO controller).


Details:
controller   size   cycles   latency   temporaries   comments
i2           i2     1        1                       hardware
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Operands: controller, size 


Description: 


Strided write of 'size' bytes on PIO controller, 'controller', using 
address and data set up in a PIO address/data put 


Constraints: 
e controller has width of 2. 
e controller has domain immediate. 
e size has domain immediate. 
e size has width of 2. 


Notes: The byte enables set up in the PIO address put affect which bytes get written from
the data section. Each bit in the byte enables corresponds to one byte in the data section,
where the lsb of the enables refers to the first byte of the data. Only bytes that have their bit
set will perform a write, and only PEs that have at least one bit set will take part in the write.
In most cases all bits should be set. The 'size' must be a multiple of 4 bytes, up to 32. Each
poly PE will write 'size' bytes in turn (in PE id order) to the current strided address, which is
incremented by the strided size after each poly PE participates in the write. The initial
strided address is the base strided address plus the offset address set up by
pio.offset.address.put. The optional controller argument defaults to 0 (the only PIO
controller).


Details: 
controller | size cycles latency temporaries | comments 
i2 i2 1 1 hardware 
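
The addressing rule described in the notes above can be illustrated with a small behavioural model. This Python sketch is not the hardware algorithm: the function strided_write and its parameters are hypothetical, memory is modelled as a dictionary, and it assumes that a PE with no enable bits set neither writes nor advances the strided address (the notes only say the address advances after each PE that participates).

```python
def strided_write(memory, base, offset, stride, size, pe_data, pe_enables):
    """Model a strided write; returns the final strided address."""
    if size % 4 != 0 or not 0 < size <= 32:
        raise ValueError("size must be a multiple of 4 bytes, up to 32")
    addr = base + offset  # initial strided address
    for data, enables in zip(pe_data, pe_enables):  # PE id order
        if enables == 0:
            # Assumption: a PE with no enable bits set does not take part
            # and does not advance the strided address.
            continue
        for i in range(size):
            if (enables >> i) & 1:  # lsb guards the first data byte
                memory[addr + i] = data[i]
        addr += stride  # advance after each participating PE
    return addr
```

With a base of 0x100, an offset of 0x10 and a stride of 8, the first participating PE writes at 0x110 and the next participating PE at 0x118.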
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pio.strided.write.flush 


Operands: controller, size 


Description: 


Strided write of 'size' bytes on PIO controller 'controller', using
control information and data set up in a PIO address/data put. Signals
the semaphore set in the pio.transfer.semaphore.put when the write is
complete.


Constraints: 
• controller has domain immediate.
• controller has width of 4.
• size has domain immediate.
• size has width of 4.


Notes: The byte enables set up by a PIO address put simply act as an enable switch. If the
8 byte value set up for the byte enables is non-zero then the PE takes part in the write; if the
value is zero then the PE is disabled for the write. The 'size' must be a multiple of 4 bytes,
up to 32. Each poly PE will write 'size' bytes in turn (in PE id order) to the current strided
address, which is incremented by the strided size after each poly PE participates in the
write. The initial strided address is the base strided address plus the offset address set up by
pio.offset.address.put. The optional controller argument defaults to 0 (the only PIO controller).


Details: 

controller | size cycles latency temporaries | comments 

i4 i4 1 1 hardware 
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Operands: controller, size 


Description: 


Strided write of 'size' bytes on PIO controller 'controller', using
data set up in a PIO data put.


Constraints: 
• controller has width of 2.
• controller has domain immediate.
• size has domain immediate.
• size has width of 2.


Notes: All PEs write all of their bytes, regardless of the bit mask. The 'size' must be a
multiple of 8 bytes, up to 64. Each poly PE will write 'size' bytes in turn (in PE id order) to the
current strided address, which is incremented by the strided size after each poly PE
participates in the write. The initial strided address is the base strided address plus the
offset address set up by pio.offset.address.put. The optional controller argument defaults to
0 (the only PIO controller).


Details: 
controller | size cycles latency temporaries | comments 
i2 i2 1 1 hardware
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pio.strided.forced.write.flush 


Operands: controller, size 


Description: 


Strided write of 'size' bytes on PIO controller, 'controller', using 
control information and data set up in a PIO address/data put, and 
signals the semaphore set in the pio.transfer.semaphore.put when the 
write is complete. 


Constraints: 
• controller has domain immediate.
• controller has width of 4.
• size has domain immediate.
• size has width of 4.


Notes: All PEs write all of their bytes, regardless of the bit mask. The 'size' must be a
multiple of 8 bytes, up to 64. Each poly PE will write 'size' bytes in turn (in PE id order) to the
current strided address, which is incremented by the strided size after each poly PE
participates in the write. The initial strided address is the base strided address plus the
offset address set up by pio.offset.address.put. The optional controller argument defaults to
0 (the only PIO controller).


Details: 

controller | size cycles latency temporaries | comments

i4 i4 1 1 hardware 
pio.strided.read


Operands: controller, size 


Description: 


Strided read of 'size' bytes on PIO controller, 'controller', using 
address and data set up in a PIO address/data put. 


Constraints: 
• controller has width of 2.
• controller has domain immediate.
• size has width of 2.
• size has domain immediate.


Notes: The byte enables set up by a PIO address put simply act as an enable switch. If the
8 byte value set up for the byte enables is non-zero then the PE takes part in the read; if the
value is zero then the PE is disabled for the read. The 'size' field must be an exact multiple of
8, up to 64. Each poly PE will read 'size' bytes in turn (in PE id order) from the current strided
address, which is incremented by the strided size after each poly PE participates in the
read. The initial strided address is the base strided address plus the offset address set up by
pio.offset.address.put. The optional controller argument defaults to 0 (the only PIO controller).


Details: 


controller | size cycles latency temporaries | comments 


i2 i2 1 1 hardware
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pio.strided.read.no.address.write.back 


Operands: controller, size 


Description: 


Strided read of 'size' bytes on PIO controller, 'controller', using 
address and data set up in a PIO address/data put, without writing 
back the final address into the node. 


Constraints: 
• controller has width of 2.
• controller has domain immediate.
• size has width of 2.
• size has domain immediate.


Notes: The byte enables set up by a PIO address put simply act as an enable switch. If the
8 byte value set up for the byte enables is non-zero then the PE takes part in the read; if the
value is zero then the PE is disabled for the read. The 'size' field must be an exact multiple of
8, up to 64. Each poly PE will read 'size' bytes in turn (in PE id order) from the current strided
address, which is incremented by the strided size after each poly PE participates in the
read. The initial strided address is the base strided address plus the offset address set up by
pio.offset.address.put. The optional controller argument defaults to 0 (the only PIO controller).


Details: 

controller | size cycles latency temporaries | comments 

i2 i2 1 1 hardware
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3.1.12 Semaphore instructions 


These instructions perform operations on semaphores. Macro semaphore instructions can 
be used by including semaphore.inc. 
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sem.get 
Operands: dst, sem 
Description: 


Copies the value of semaphore sem into dst.


Constraints: 
• dst has domain mono.
• dst has type unsigned.
• dst has width of 2.
• sem has domain mono or immediate.
• sem has width of 2.


Notes: This uses mutex to prevent a thread swap during its use. There may be an additional 
4 or more cycles, depending on semaphore pipeline issues. 


Details: 

dst sem cycles latency temporaries | comments 

m2u m2 macro
m2u i2 macro
sem.put


Operands: sem, src 


Description: 


Copies src into the semaphore sem 


Constraints: 
• sem has domain immediate.
• sem has width of 2.
• src has domain immediate.
• src has width of 2.


Notes: If there are any threads waiting on this semaphore, a non-zero value of src will allow 
them to complete. 


Details: 
sem src cycles latency temporaries | comments 
i2 i2 1 1 hardware 
sem.put


Operands: semdata 
Description: 


Copies the most significant byte of semdata into the semaphore indicated
by the least significant byte.


Constraints: 
• semdata has domain mono.
• semdata has width of 2.


Notes: If there are any threads waiting on this semaphore, a non-zero value (in the most
significant byte of semdata) will allow them to complete.


Details: 
semdata cycles latency temporaries | comments 
m2 1 1 hardware 
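
The layout of the two-byte semdata operand described above can be sketched as follows. This Python fragment only illustrates the byte packing; pack_semdata and unpack_semdata are hypothetical helper names, not SDK functions, and the 0 to 255 range checks simply follow from each field occupying one byte.

```python
def pack_semdata(sem, value):
    """Build a 16-bit semdata word: value in the most significant byte,
    semaphore index in the least significant byte."""
    if not 0 <= sem <= 0xFF or not 0 <= value <= 0xFF:
        raise ValueError("sem and value must each fit in one byte")
    return (value << 8) | sem


def unpack_semdata(semdata):
    """Return (sem, value) from a 16-bit semdata word."""
    return semdata & 0xFF, (semdata >> 8) & 0xFF
```

For instance, writing the value 1 to semaphore 3 corresponds to the semdata word 0x0103.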
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sem.sig 


Operands: sem 


Description: 


Signal the semaphore sem 


Constraints: 


• sem has domain mono or immediate.
• sem has width of 2.


Details: 

sem cycles latency temporaries | comments 
m2 1 1 hardware 
i2 1 1 hardware 
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sem.wait 


Operands: sem 


Description: 
Yielding wait on the semaphore sem to become non-zero 


Constraints: 
• sem has domain mono or immediate.
• sem has width of 2.


Notes: A yielding wait means that this thread can be interrupted by a lower priority thread
whilst waiting for the semaphore to become non-zero.


Details: 

sem cycles latency temporaries | comments 
m2 1 1 hardware 
i2 1 1 hardware
sem.sync 
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Operands: sem 


Description: 


Non-yielding wait on the semaphore sem to become non-zero


Constraints: 


• sem has domain mono or immediate.
• sem has width of 2.


Details: 

sem cycles latency temporaries | comments 
m2 1 1 hardware 
i2 1 1 hardware 
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3.1.13 Miscellaneous instructions 


These instructions perform various operations that do not fit into any other category.
Macro miscellaneous instructions can be used by including misc.inc.
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nop 
Operands: 
Description: 
Performs a mono operation which does nothing 
Details: 
cycles latency temporaries | comments 
1 1 hardware 
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nop.poly 


Operands: 


Description: 


Performs a poly operation which does nothing 


Details: 
cycles latency temporaries | comments 
1 1 microcode 
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terminate 


Operands: 


Description: 


Terminates the currently running program 


Details: 


cycles latency temporaries | comments 


2 2 macro 
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penum 
Operands: dst 
Description: 
Sets dst to be the PE number of the PE 
Constraints: 
• dst has domain poly.
• dst has type unsigned.
Details: 
dst cycles latency temporaries | comments 
p2u 1 1 hardware 
p1u 1 1 hardware
p4u 1 1 hardware 
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thread.get 


Operands: dst 


Description: 


Sets dst to be the current thread index. 
Constraints: 
• dst has width of 2.
• dst has domain mono.
• dst has type unsigned.


Details: 
dst cycles latency temporaries | comments 
m2u 1 1 hardware 
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cycles.get 


Operands: dst 


Description: 


Sets dst to be the cycle count of the processor. 
Constraints: 
• dst has width of 4.
• dst has domain mono.
• dst has type unsigned.


Details: 
dst cycles latency temporaries | comments 
m4u 6 6 2 mono | macro
cycles.put 


Operands: src 


Description: 


Sets cycle count for the processor to be src. 
Constraints: 
• src has width of 4.
• src has domain mono or immediate.
• src has type unsigned.


Details: 

src cycles latency temporaries | comments 
m4u 5 5 macro 

i4u 5 5 macro 
enable.get 


Operands: dst 


Description: 


Sets dst to be the enable state of the PE. Is executed regardless
of the enable state.


Constraints:
• dst has width of 1.
• dst has domain poly.
• dst has type unsigned.
• dst has odd alignment.


Details: 
dst cycles latency temporaries | comments 
p1u 1 2 microcode
enable.put 


Operands: src 


Description: 


Sets the enable state of the PE to be the value contained in src. Is
executed regardless of the enable state.


Constraints:
• src has width of 1.
• src has domain poly.
• src has type unsigned.
• src has odd alignment.


Details: 
src cycles latency temporaries | comments 
p1u 2 2 microcode
status.get 


Operands: dst 


Description: 


Sets dst to be the status of the PE. Is executed regardless of the 
enable state. 


Constraints: 


• dst has width of 1.
• dst has domain poly.
• dst has type unsigned.
• dst has even alignment.


Details: 
dst cycles latency temporaries | comments 
p1u 1 2 microcode
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status.put 


Operands: src 


Description: 


Sets the status of the PE to be the value contained in src. Is
executed regardless of the enable state.


Constraints:
• src has width of 1.
• src has domain poly.
• src has type unsigned.


Details: 
src cycles latency temporaries | comments 
p1u 2 2 microcode
status.fpadd.get


Operands: dst 


Description: 


Sets dst to be the floating point add status of the PE. Is executed
regardless of the enable state.


Constraints:
• dst has width of 1.
• dst has domain poly.
• dst has type unsigned.


Details: 
dst cycles latency temporaries | comments 
p1u 1 2 microcode
status.fpmul.get


Operands: dst 


Description: 


Sets dst to be the floating point mul status of the PE. Is executed
regardless of the enable state.


Constraints:
• dst has width of 1.
• dst has domain poly.
• dst has type unsigned.


Details: 
dst cycles latency temporaries | comments 
p1u 1 2 microcode
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context.store 


Operands: locn, src 


Description: 


Store poly context information for thread, including src 
Constraints: 
• locn has domain immediate or label.
• locn has width of 2.
• src has domain poly.
• src has maximum width of 32.


Notes: Stores (poly) context information for a thread. This includes saving src, which
may be of any width. It also includes the enable state, status registers and any internal
registers of the ALU.


Details: 

locn src cycles latency temporaries | comments
il2 p2 4 4 macro
il2 p4 4 4 macro
il2 p8 5 5 macro
il2 p16 7 7 macro
il2 p32 11 11 macro
context.load 


Operands: src, locn 


Description: 


Load poly context information for thread, including src 


Constraints: 


• src has domain poly.
• src has maximum width of 32.
• locn has domain immediate or label.
• locn has width of 2.


Notes: Restores (poly) context information for a thread. This includes src, which can be
any width. It also includes the enable state, status registers and any internal registers of the
ALU.

Details: 

src locn cycles latency temporaries | comments
p2 il2 6 6 macro
p4 il2 6 6 macro
p8 il2 7 7 macro
p16 il2 9 9 macro
p32 il2 13 13 macro
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break 


Operands: src 


Description: 


Triggers a breakpoint; stops only the current thread.
Constraints:
• src has domain immediate.


Notes: src will be passed to the debugger.
break.stop


Operands: src 


Description: 


Triggers a breakpoint; stops all threads.
Constraints:
• src has domain immediate.


Notes: src will be passed to the debugger.
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dcache.invalidate 


Operands: 


Description: 


data cache invalidate 


Notes: Marks all of the cache entries as invalid. The instruction effectively executes straight 
away but the cache will take some time to actually perform the operation. 
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dcache.invalidate 


Operands: addr 


Description: 


Data cache invalidate of an address
Constraints:
• addr has domain mono.
• addr has width of 4.


Notes: Marks addr as invalid if it is in the cache. The instruction effectively executes 
straight away but the cache will take some time to actually perform the operation. 
dcache.invalidate 


Operands: addr, offset 


Description: 


Data cache invalidate of an address
Constraints:
• addr has domain mono.
• addr has width of 4.
• offset has domain mono or immediate.
• offset has width of 2.


Notes: Marks the address addr+offset as invalid if it is in the cache. The instruction
effectively executes straight away but the cache will take some time to actually perform the
operation.
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dcache.invalidate.writeback 


Operands: 


Description: 


data cache invalidate with writeback 


Notes: Marks all of the cache entries as invalid. It forces the entries that are in the cache
to be flushed to memory. The instruction effectively executes straight away but the cache
will take some time to actually perform the operation. To guarantee that a dcache writeback
has happened, follow it with a load that uses the cache; the load will be held off until the
cache has finished writing back.
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dcache.invalidate.writeback 


Operands: addr 


Description: 


Data cache invalidate of an address with writeback
Constraints:
• addr has domain mono.
• addr has width of 4.


Notes: Marks addr as invalid if it is in the cache. It forces addr to be flushed to memory if it
is in the cache. The instruction effectively executes straight away but the cache will take
some time to actually perform the operation. To guarantee that a dcache writeback has
happened, follow it with a load that uses the cache; the load will be held off until the
cache has finished writing back.


Document No. 06-RM-1137 Revision: 3.A


ClearSpeed Technology plc 


dcache.invalidate.writeback 
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Operands: addr, offset 


Description: 


Data cache invalidate of an address with writeback


Constraints:
• addr has domain mono.
• addr has width of 4.
• offset has domain mono or immediate.
• offset has width of 2.


Notes: Marks the address addr+offset as invalid if it is in the cache. It forces the address
addr+offset to be flushed to memory if it is in the cache. The instruction effectively executes
straight away but the cache will take some time to actually perform the operation. To
guarantee that a dcache writeback has happened, follow it with a load that uses the
cache; the load will be held off until the cache has finished writing back.
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icache.invalidate 


Operands: 


Description: 


instruction cache invalidate 


Notes: Marks all of the cache entries as invalid. Note that the instruction immediately
following this one may not be invalidated, depending on pipeline state.
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icache.invalidate 


Operands: addr 


Description: 


Instruction cache invalidate of address addr
Constraints:
• addr has domain mono.
• addr has width of 4.


Notes: Marks addr as invalid if it is in the cache. Note that the instruction immediately
following this one may not be invalidated, depending on pipeline state.
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icache.invalidate 


Operands: addr, offset 


Description: 


Instruction cache invalidate of address addr + offset
Constraints:
• addr has domain mono.
• addr has width of 4.
• offset has domain mono or immediate.
• offset has width of 2.


Notes: Marks the address addr+offset as invalid if it is in the cache. Note that the instruction
immediately following this one may not be invalidated, depending on pipeline state.
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mutex.start 


Operands: 


Description: 


Start of mutex section 


Notes: During a mutex section no other threads can activate. This should be used to
ensure that code that is not thread safe can be executed safely. The atomic instructions
use mutex. The mutex section is finished with a mutex.end instruction. Some macro
instructions (e.g. mono floating point and pio.atomic.start) use mutex, so this instruction
should be used with caution.
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mutex.ac.start 


Operands: 


Description: 


Start of mutex section, entering when the array controller is not 
full 


Notes: The array controller (ac) controls the poly elements. During a mutex section no
other threads can activate. This should be used to ensure that code that is not thread
safe can be executed safely. The atomic instructions use mutex. The mutex section is
finished with a mutex.end instruction. Some macro instructions (e.g. mono floating point and
pio.atomic.start) use mutex, so this instruction should be used with caution.
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mutex.ac.yield.start 


Operands: 


Description: 


Start of mutex section, entering on array controller not full, 
yielding if it is 


Notes: The array controller controls the poly elements. During a mutex section no other
threads can activate. This should be used to ensure that code that is not thread safe
can be executed safely. The atomic instructions use mutex. The mutex section is finished
with a mutex.end instruction. Some macro instructions (e.g. mono floating point and
pio.atomic.start) use mutex, so this instruction should be used with caution.
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mutex.pio.start 


Operands: controller 


Description: 


Start of mutex section, waiting on pio controller not being full 


Notes: During a mutex section no other threads can activate. This should be used to
ensure that code that is not thread safe can be executed safely. The atomic instructions
use mutex. The mutex section is finished with a mutex.end instruction. Some macro
instructions (e.g. mono floating point and pio.atomic.start) use mutex, so this instruction
should be used with caution. This is used by the atomic instructions.
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mutex.pio.yield.start 
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Operands: controller 


Description: 


Start of mutex section, waiting on pio controller not full, 
yielding if it is 


Notes: During a mutex section no other threads can activate. This should be used to
ensure that code that is not thread safe can be executed safely. The atomic instructions
use mutex. The mutex section is finished with a mutex.end instruction. Some macro
instructions (e.g. mono floating point and pio.atomic.start) use mutex, so this instruction
should be used with caution. This is used by the atomic instructions.
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mutex.end 


Operands: 


Description: 


End of mutex section 


Notes: This ends the current mutex section. Other threads may now interrupt as
before.
3.2 Low level instructions 
The low level section consists of hardware and microcoded instructions. It is intended
primarily to help in understanding the code generated from the high level section; these
instructions are less safe to use directly and may change.


3.2.1 Read and write internal register instructions 
These instructions allow reading and writing of registers internal to the mono processor
(result registers) and array controller.
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mono.result.get 


Operands: dst, src 


Description: 


dst = result register src 
Constraints: 
• dst has domain mono.
• dst has width of 2.
• src has domain immediate.
• src has width of 1.


Notes: The result register constants are defined in the header file result_constants.inc.


Details: 

dst src cycles latency temporaries | comments 

m2 i1 1 1 hardware
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mono.result.put 


Operands: dst, src 


Description: 


result register dst = src 


Constraints: 


• dst has domain immediate.
• src has width of 2.
• src has domain mono.


Notes: The result register dst will be set to value src. The result register constants are
defined in the header file result_constants.inc.


Details: 

dst src cycles latency temporaries | comments 
i1 m2 hardware
i2 m2 hardware 
i4 m2 hardware 
ac.ls.reg.get
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Operands: result, src 


Description: 


Read the array controller register using the array controller
load/store pipeline


Constraints:
• result has domain immediate.
• result has width of 1.
• src has domain immediate.
• src has width of 1.


Notes: This is read into the mono result register result. The array controller register
constants are defined in the header file ac_constants.inc and the result register constants
are defined in result_constants.inc.


Details: 
result src cycles latency temporaries | comments
i1 i1 hardware
ac.ls.reg.get.sig


Operands: result, src, sem 
Description: 
Read the array controller register using the array controller
load/store pipeline, then signal the semaphore when it is stored in the
result register.


Constraints:
• result has domain immediate.
• result has width of 1.
• src has domain immediate.
• src has width of 1.
• sem has domain mono or immediate.
• sem has width of 2.


Notes: This is read into the mono result register result. The array controller register 
constants are defined in the header file ac_constants.inc and the result register constants 
are defined in result_constants.inc. 


Details: 

result src sem cycles latency temporaries | comments 

i1 i1 m2 1 1 hardware
i1 i1 i2 1 1 hardware
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ac.pe.reg.get 


Operands: result, src 


Description: 


Read the array controller register using the array controller PE 
pipeline 


Constraints: 
• result has domain immediate.
• result has width of 1.
• src has domain immediate.
• src has width of 1.


Notes: This is read into the mono result register result. The array controller register 
constants are defined in the header file ac_constants.inc and the result register constants 
are defined in result_constants.inc. 


Details: 

result src cycles latency temporaries | comments 

i1 i1 1 1 hardware
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ac.pe.reg.get.sig 


Operands: result, src, sem 


Description: 


Read the array controller register using the array controller PE
pipeline, then signal the semaphore when it is stored in the result
register.


Constraints:
• result has domain immediate.
• result has width of 1.
• src has domain immediate.
• src has width of 1.
• sem has domain mono or immediate.
• sem has width of 2.


Notes: This is read into the mono result register result. The array controller register 
constants are defined in the header file ac_constants.inc and the result register constants 
are defined in result_constants.inc. 


Details: 

result src sem cycles latency temporaries | comments 

i1 i1 m2 1 1 hardware
i1 i1 i2 1 1 hardware
ac.pe.reg.put
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Operands: register, data 


Description: 


Write to the array controller register using the array controller
PE pipeline


Constraints:
• register has domain immediate.
• register has width of 1.
• data has domain mono or immediate.
• data has width of 2.


Notes: The array controller register constants are defined in the header file 
ac_constants.inc and the result register constants are defined in result_constants.inc. 


Details: 

register data cycles latency temporaries | comments 
i1 m2 hardware
i1 i2 hardware
ac.ls.reg.put
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Operands: register, data 


Description: 


Write to the array controller register using the array controller 
load/store pipeline 


Constraints: 


• register has domain immediate.
• register has width of 1.
• data has domain mono or immediate.
• data has width of 2.


Notes: The array controller register constants are defined in the header file 
ac_constants.inc and the result register constants are defined in result_constants.inc. 


Details: 

register data cycles latency temporaries | comments 
i1 m2 hardware
i1 i2 hardware
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3.2.2 Arithmetic instructions 


These instructions perform low level microcoded instructions which are used by the macro 
instructions. It is advisable not to use them directly as they may change. 
div.normalised.sdiv


Operands: dst, src0, src1


Description: 


dst = (src0 << 32) / src1
Constraints:
• All operands have domain poly.
• dst has width of 4.
• dst has type unsigned.
• src0 has width of 4.
• src0 has type unsigned.
• src1 has width of 4.
• src1 has type unsigned.
Notes: src1 must be normalised (most significant bit set). This is used in a div
instruction, by first normalising the dividend, doing this instruction and then shifting the
result to the right by the same amount as the normalisation.


Details: 

dst src0 src1 cycles latency temporaries | comments 
p4u p4u p4u 33 34 microcode 
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div.start (floating point 64 bit) 


Operands: dst, src0, src1


Description: 


Initiate dst = src0 / src1


Constraints:
• All operands have domain poly.
• All operands have type float.
• All operands have width of 8.
• dst cannot overlap with the other operands.
• src1 does not have domain label.


Side Effects:
• Leaves the status register in an undefined state.
• Requires up to 2 levels of enable stack.


Notes: This instruction should immediately precede a div.end instruction. It should be within
a mutex.start / mutex.end block for thread safety.


Details: 

dst src0 src1 cycles latency temporaries | comments
p8f p8f p8f 62 63 microcode
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div.end (floating point 64 bit) 


Operands: dst, src0, src1


Description: 


End dst = src0 / src1
Constraints:
• All operands have domain poly.
• All operands have type float.
• All operands have width of 8.
• dst cannot overlap with the other operands.
• src1 does not have domain label.
Side Effects:
• Leaves the status register in an undefined state.
• Requires up to 2 levels of enable stack.
Notes: This instruction should immediately follow a div.start instruction. It should be
within a mutex.start / mutex.end block for thread safety.


Details: 

dst src0 src1 cycles latency temporaries | comments
p8f p8f p8f 26 27 microcode
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3.2.3 Shift instructions 


These instructions perform low level microcoded shift instructions which are used by the 
macro instructions. It is advisable not to use them directly as they may change. 
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normalise 


Operands: dst, src0, count 


Description: 


Normalise src0, placing the number of shifts required into count
Constraints:
• All operands have domain poly.
• dst has width of 4.
• dst has type unsigned.
• src0 has width of 4.
• src0 has type unsigned.
• count has width of 1.
• count has type unsigned.


Notes: This instruction may form part of the high level instruction at some point; however 
currently the main purpose is to support the div instruction 
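As a rough illustration of what a normalise step of this kind does, the following sketch models it in Python rather than CSX600 assembly. It assumes that normalising means shifting the 4-byte unsigned value left until its most significant bit is set, and that a zero input is returned unchanged with a count of 0. Neither detail is stated above, and the function name is hypothetical.

```python
def normalise(src0):
    """Behavioural model of normalise for a 4-byte unsigned value.

    Assumption: the value is shifted left until bit 31 is set.
    Assumption: a zero input yields (0, 0); the hardware may differ.
    Returns the pair (dst, count).
    """
    if not 0 <= src0 < (1 << 32):
        raise ValueError("src0 must fit in 4 unsigned bytes")
    if src0 == 0:
        return 0, 0
    dst, count = src0, 0
    # Shift left one bit at a time until the top bit is set.
    while (dst & 0x80000000) == 0:
        dst = (dst << 1) & 0xFFFFFFFF
        count += 1
    return dst, count
```

Under these assumptions count never exceeds 31, which fits the 1-byte count operand.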
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3.2.4 Input and Output instructions 
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pioe.reg.put 


Operands: reg, val 


Description: 


Write to a PIO controller register on the default PIO controller.
Constraints:
• reg has domain immediate.
• reg has width of 4.
• val has domain mono or immediate.
• val has width of 2.


Notes: This defaults to the first PIO controller. The I/O register constants are defined in the
header file pio_constants.inc.


Details: 
reg val cycles latency temporaries | comments 
i4 m2 1 1 hardware 
i4 i2 1 1 hardware 
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Instruction set description 


pioe.reg.put 


Operands: controller, reg, val


Description:


Write to a PIO controller register on PIO controller, controller.
Constraints:
• controller has domain immediate.
• controller has width of 4.
• reg has domain immediate.
• reg has width of 4.
• val has domain mono or immediate.
• val has width of 2.


Notes: This defaults to the first PIO controller. The I/O register constants are defined in the 
header file pio_constants.inc. 


Details: 
controller reg val cycles latency temporaries | comments 
i4 i4 m2 1 hardware 
i4 i4 i2 1 hardware 
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pioc.reg.get 


Operands: res, reg 


Description: 


Read a PIO controller register on the default controller.
Constraints:
• res has domain mono or immediate.
• res has width of 4.
• reg has domain immediate.
• reg has width of 4.


Notes: This reads into the mono result register, res. It is necessary to wait for this to 
happen by doing a pioc.sig followed by a sem.sync. The I/O register constants are defined 
in the header file pio_constants.inc. 


Details: 
res reg cycles latency temporaries | comments 
i4 i4 1 1 hardware 
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pioc.reg.get 


Operands: controller, res, reg 


Description: 


Read a PIO controller register on controller controller.
Constraints:
• controller has domain immediate.
• controller has width of 4.
• res has domain mono or immediate.
• res has width of 2.
• reg has domain immediate.
• reg has width of 4.


Notes: This reads into the mono result register, res. It is necessary to wait for this to
happen by doing a pioc.sig followed by a sem.sync. The I/O register constants are defined
in the header file pio_constants.inc.


Details:
controller res reg cycles latency temporaries | comments
i4 m2 i4 1 1 hardware
i4 i2 i4 1 1 hardware
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pioe.reg.get 


Operands: res, reg 


Description: 


Read a PIO controller register on the default PIO controller.
Constraints:
• res has domain mono or immediate.
• res has width of 4.
• reg has domain immediate.
• reg has width of 4.


Notes: This reads into the mono result register, res. It is necessary to wait for this to 
happen by doing a pioc.sig followed by a sem.sync. The I/O register constants are defined 
in the header file pio_constants.inc. 


Details: 
res reg cycles latency temporaries | comments 
i4 i4 1 1 hardware 
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pioe.reg.get 


Operands: controller, res, reg 


Description: 


Read a PIO controller register on PIO controller, controller.
Constraints:
• controller has domain immediate.
• controller has width of 2.
• res has domain mono or immediate.
• res has width of 2.
• reg has domain immediate.
• reg has width of 4.


Notes: This reads into the mono result register, res. It is necessary to wait for this to
happen by doing a pioe.sig followed by a sem.sync. The I/O register constants are defined
in the header file pio_constants.inc.


Details:
controller res reg cycles latency temporaries | comments
i2 m2 i4 1 1 hardware
i2 i2 i4 1 1 hardware
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pioc.reg.put 


Operands: reg, val 


Description: 


Write to a PIO controller register on the default controller.
Constraints:
• reg has domain immediate.
• reg has width of 4.
• val has domain mono or immediate.
• val has width of 2.


Notes: This defaults to the first PIO controller. The I/O register constants are defined in the 
header file pio_constants.inc. 


Details: 
reg val cycles latency temporaries | comments 
i4 m2 1 1 hardware 
i4 i2 1 1 hardware 
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Instruction set description 


pioc.reg.put 


Operands: controller, reg, val 


Description: 


Write to a PIO controller register on controller controller.
Constraints:
• controller has domain immediate.
• controller has width of 4.
• reg has domain immediate.
• reg has width of 4.
• val has domain mono or immediate.
• val has width of 2.


Notes: The I/O register constants are defined in the header file pio_constants.inc.


Details:
controller reg val cycles latency temporaries | comments
i4 i4 m2 hardware
i4 i4 i2 hardware
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3.2.5 If instructions 
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Instruction set description 


aeo.shift 


Operands:


Description:


Performs all enables off across the PEs, shifting the previous aeo
to the left.


Notes: This produces a single bit which is 1 if all the PEs are disabled and 0 otherwise. This
value is shifted into a value stored internally in the array controller. This value can be
accessed via a result register only after an aeo.shift.result.


Details:


cycles latency temporaries | comments


1 microcode
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aeo.shift.result 




Operands: 


Description: 


Performs all enables off across the PEs, shifting the previous aeo
to the left and placing the result in a result register.


Notes: This produces a single bit which is 1 if all the PEs are disabled and 0 otherwise. This
value is shifted into a value stored internally in the array controller. This value can be
accessed via a result register, which can be read using mono.result.get. Before reading the
result register it is necessary to ensure that the value has reached the mono processor. This
can be done with a poly.sig, followed by a sem.sync.


Details: 


cycles latency temporaries | comments 


1 1 microcode 
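The bit-shifting behaviour described in the notes can be pictured with a small Python model. This is only a sketch: the width of the value held in the array controller is not given in the text, so the 8-bit width below is an assumption, and the class and method names are hypothetical.

```python
class AeoShiftModel:
    """Toy model of the aeo value held inside the array controller."""

    def __init__(self, width=8):
        # Assumption: the real width of the internal value is not documented here.
        self.width = width
        self.value = 0

    def aeo_shift(self, enables):
        """Shift in one 'all enables off' bit.

        enables holds one boolean per PE (True means the PE is enabled).
        """
        bit = 0 if any(enables) else 1  # 1 only when every PE is disabled
        mask = (1 << self.width) - 1
        self.value = ((self.value << 1) | bit) & mask
        return self.value
```

Each call corresponds to one aeo.shift; the most recent result sits in the least significant bit and earlier results move to the left.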
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aeo.data.get 


Operands: src 


Description: 


Uses all enables off on the PEs to get a byte of data ANDed across all
PEs.


Constraints:
• src has width of 1.
• src has domain poly.
• src has type unsigned.


Notes: This produces a byte which is the ANDed value of src across all enabled PEs. This
value is shifted into a value stored internally in the array controller. This value can be
accessed via a result register, which can be read using mono.result.get. Before reading the
result register it is necessary to ensure that the value has reached the mono processor. This
can be done with a poly.sig, followed by a sem.sync. This is used by simple
poly.to.mono.and.


Details:
src cycles latency temporaries | comments
p1u 9 9 microcode
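The reduction described in the notes can be sketched in Python as a bitwise AND over the bytes held by the enabled PEs. The function name is hypothetical, and the result for the case where no PE is enabled (0xFF, the identity for AND) is an assumption rather than documented behaviour.

```python
from functools import reduce


def aeo_data_get(src, enables):
    """AND together the byte held by each enabled PE.

    src     -- one unsigned byte per PE
    enables -- one boolean per PE (True means the PE is enabled)
    Assumption: with no PE enabled the result is 0xFF.
    """
    enabled_bytes = [b & 0xFF for b, en in zip(src, enables) if en]
    return reduce(lambda a, b: a & b, enabled_bytes, 0xFF)
```

Disabled PEs contribute nothing to the result, so a zero byte on a disabled PE does not clear any bits.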
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3.2.6 Branch instructions 


All absolute branch instructions are split into two low level instructions which set the high
and low parts of the branch. These instructions are the same as the ones discussed at the
high level, but start with j.lo and j.hi instead of j. The j.lo based instruction must come
before the j.hi, which actually causes the jump to take place. This is currently for
disassembly use only; it is not supported in the instruction set.
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3.2.1 Semaphore instructions 


These instructions perform operations on semaphores. It is advisable not to use these 
instructions directly. 
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sem.select 


Operands: sem 


Description: 


Select the semaphore which will be waited on.
Constraints:
• sem has domain mono or immediate.
• sem has width of 2.


Notes: This semaphore will be used by a subsequent sem.wait.selected or
sem.sync.selected operation.


Details:
sem cycles latency temporaries | comments
m2 1 1 hardware
i2 1 1 hardware
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sem.wait.selected 


Operands: 


Description: 


Yielding wait on the semaphore which has been selected by sem.select 


Notes: A yielding wait means that this thread can be interrupted by a lower priority thread 
whilst waiting for the semaphore to become non-zero. 
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sem.sync.selected 


Operands: 


Description: 


Non-yielding wait on the semaphore which has been selected by 
sem.select. 
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sem.get.request 


Operands: sem 


Description: 


Copies the value of semaphore sem into a result register.
Constraints:
• sem has domain mono or immediate.
• sem has width of 2.


Notes: This requests a read on semaphore sem. It will arrive in the result register
_RESULT_RES_SEMAPHORE. It is necessary to perform a signal and then wait, before
using the result. It is easier to use sem.get, which calls this. The result register constants are
defined in result_constants.inc.


Details: 

sem cycles latency temporaries | comments 
m2 1 1 hardware 
i2 1 1 hardware 
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ac.pe.sig 


Operands: sem 


Description: 


Signal the semaphore sem when the ac pe pipeline is clear.
Constraints:
• sem has domain mono or immediate.
• sem has width of 2.


Notes: This is used in conjunction with instructions such as aeo.shift to ensure that aeo has
completed. A sem.sync may be used on the semaphore. The ac.ls.sig is described in the
high level section.


Details: 

sem cycles latency temporaries | comments 
m2 1 1 hardware 
i2 1 1 hardware 
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3.3 Specialized instructions 


These instructions accelerate specific functionality for applications. They include vector
floating point, complex multiply and swazzle instructions.


3.3.1 Vector Floating Point Instructions


This is a list of instructions specifically intended for ClearSpeed's MTAP floating point unit.
They exploit the internal pipelined implementation of the unit. To explain the function of each
instruction, eight virtual registers are used. These are MUL0 through MUL3 and ACC0
through ACC3. Note that operands are untyped here; in fact they represent vectors of 4 or 8
byte floats.
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vec.mulacc.begin 




Operands: mul, arg1, arg2


Description:


Start a pipelined multiply accumulate.
Constraints:
• All operands have domain poly.
• All operands have type float.
• mul has width of 4.
• arg1 has width of 4.
• arg1 has vector size of 2.
• arg2 has width of 4.
• arg2 has vector size of 2.


Notes: Multiply the 2 vectors arg1 and arg2 by mul. Store the result in virtual multiply registers.


MUL0 = mul * arg1[0]
MUL1 = mul * arg1[1]
MUL2 = mul * arg2[0]
MUL3 = mul * arg2[1]
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vec.mulacc.begin 


Operands: mul, arg1, arg2


Description:


Start a pipelined multiply accumulate.
Constraints:
• All operands have width of 8.
• All operands have domain poly.
• All operands have type float.
• arg1 has vector size of 2.
• arg2 has vector size of 2.


Notes: Multiply the 2 vectors arg1 and arg2 by mul. Store the result in virtual multiply registers.


MUL0 = mul * arg1[0]
MUL1 = mul * arg1[1]
MUL2 = mul * arg2[0]
MUL3 = mul * arg2[1]


Document No. 06-RM-1137 Revision: 3.A 


ClearSpeed Technology plc
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vec.mulacc.head 




Operands: mul, arg1, arg2


Description:


Paired with vec.mulacc.begin.
Constraints:
• All operands have domain poly.
• All operands have type float.
• arg1 has width of 4.
• arg1 has vector size of 2.
• arg2 has width of 4.
• arg2 has vector size of 2.


Notes: Copies virtual multiply registers to virtual accumulate registers. Multiply the 2
vectors arg1 and arg2 by mul. Store the result in virtual multiply registers.


ACC0 = MUL0 + 0
ACC1 = MUL1 + 0
ACC2 = MUL2 + 0
ACC3 = MUL3 + 0
MUL0 = mul * arg1[0]
MUL1 = mul * arg1[1]
MUL2 = mul * arg2[0]
MUL3 = mul * arg2[1]
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vec.mulacc.head 


Operands: mul, arg1, arg2


Description:


Paired with vec.mulacc.begin.
Constraints:
• All operands have domain poly.
• All operands have width of 8.
• All operands have type float.
• arg1 has vector size of 2.
• arg2 has vector size of 2.


Notes: Copies virtual multiply registers to virtual accumulate registers. Multiply the 2 vectors
arg1 and arg2 by mul. Store the result in virtual multiply registers.


ACC0 = MUL0 + 0
ACC1 = MUL1 + 0
ACC2 = MUL2 + 0
ACC3 = MUL3 + 0
MUL0 = mul * arg1[0]
MUL1 = mul * arg1[1]
MUL2 = mul * arg2[0]
MUL3 = mul * arg2[1]
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vec.mulacc.step 


Operands: mul, arg1, arg2


Description:


Iterate multiply accumulate.
Constraints:
• All operands have domain poly.
• All operands have type float.
• mul has width of 4.
• arg1 has width of 4.
• arg1 has vector size of 2.
• arg2 has width of 4.
• arg2 has vector size of 2.


Notes: Accumulates virtual accumulate registers with virtual multiply registers. Multiply the
2 vectors arg1 and arg2 by mul. Store the result in virtual multiply registers.


ACC0 += MUL0
ACC1 += MUL1
ACC2 += MUL2
ACC3 += MUL3
MUL0 = mul * arg1[0]
MUL1 = mul * arg1[1]
MUL2 = mul * arg2[0]
MUL3 = mul * arg2[1]
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vec.mulacc.step 


Operands: mul, arg1, arg2


Description:


Iterate multiply accumulate.
Constraints:
• All operands have domain poly.
• All operands have width of 8.
• All operands have type float.
• arg1 has vector size of 2.
• arg2 has vector size of 2.


Notes: Accumulates virtual accumulate registers with virtual multiply registers. Multiply the
2 vectors arg1 and arg2 by mul. Store the result in virtual multiply registers.


ACC0 += MUL0
ACC1 += MUL1
ACC2 += MUL2
ACC3 += MUL3
MUL0 = mul * arg1[0]
MUL1 = mul * arg1[1]
MUL2 = mul * arg2[0]
MUL3 = mul * arg2[1]
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vec.mulacc.end 




Operands: dst1, dst2


Description:


Finish multiply accumulate.
Constraints:
• All operands have domain poly.
• All operands have type float.
• dst1 has width of 4.
• dst1 has vector size of 2.
• dst2 has width of 4.
• dst2 has vector size of 2.


Notes: Accumulates virtual accumulate registers with virtual multiply registers. Copies the
result into vectors dst1 and dst2.


ACC0 += MUL0
ACC1 += MUL1
ACC2 += MUL2
ACC3 += MUL3
dst1[0] = ACC0
dst1[1] = ACC1
dst2[0] = ACC2
dst2[1] = ACC3
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vec.mulacc.end 


Operands: dst1, dst2


Description:


Finish multiply accumulate.
Constraints:
• All operands have domain poly.
• All operands have width of 8.
• All operands have type float.
• dst1 has vector size of 2.
• dst2 has vector size of 2.


Notes: Accumulates virtual accumulate registers with virtual multiply registers. Copies the
result into vectors dst1 and dst2.


ACC0 += MUL0
ACC1 += MUL1
ACC2 += MUL2
ACC3 += MUL3
dst1[0] = ACC0
dst1[1] = ACC1
dst2[0] = ACC2
dst2[1] = ACC3
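To see how the begin, head, step and end instructions fit together, the register transfers listed in the notes can be replayed in a small Python model. This is a sketch only: the class and method names are hypothetical, begin is assumed to load the multiply registers in the same way as head and step, and pipeline timing is not modelled.

```python
class MulAccModel:
    """Replays the virtual MUL0..MUL3 / ACC0..ACC3 register transfers."""

    def __init__(self):
        self.mul = [0.0] * 4
        self.acc = [0.0] * 4

    def _load(self, mul, arg1, arg2):
        # MUL0..MUL3 = mul * arg1[0], mul * arg1[1], mul * arg2[0], mul * arg2[1]
        self.mul = [mul * arg1[0], mul * arg1[1], mul * arg2[0], mul * arg2[1]]

    def begin(self, mul, arg1, arg2):
        # Assumption: begin only loads the multiply registers.
        self._load(mul, arg1, arg2)

    def head(self, mul, arg1, arg2):
        self.acc = [m + 0.0 for m in self.mul]                   # ACCn = MULn + 0
        self._load(mul, arg1, arg2)

    def step(self, mul, arg1, arg2):
        self.acc = [a + m for a, m in zip(self.acc, self.mul)]   # ACCn += MULn
        self._load(mul, arg1, arg2)

    def end(self):
        self.acc = [a + m for a, m in zip(self.acc, self.mul)]   # ACCn += MULn
        return [self.acc[0], self.acc[1]], [self.acc[2], self.acc[3]]  # dst1, dst2
```

In this model a begin, head, step, end sequence accumulates the products supplied to begin, head and step; each additional step adds one more set of products before end drains the last one.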
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vec.add.begin 


Operands: arg1, arg2


Description:


Add 2 pairs of floating point numbers.
Constraints:
• All operands have domain poly.
• All operands have type float.
• arg1 has width of 4.
• arg1 has vector size of 2.
• arg2 has width of 4.
• arg2 has vector size of 2.


Notes: Adds 2 pairs of floating point numbers and puts them in virtual accumulate
registers.


ACC2 = arg1[0] + arg2[0]
ACC3 = arg1[1] + arg2[1]
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vec.add.begin 


Operands: arg1, arg2


Description:


Add 2 pairs of floating point numbers.
Constraints:
• All operands have domain poly.
• All operands have type float.
• All operands have width of 8.
• arg1 has vector size of 2.
• arg2 has vector size of 2.


Notes: Adds 2 pairs of floating point numbers and puts them in virtual accumulate
registers.


ACC2 = arg1[0] + arg2[0]
ACC3 = arg1[1] + arg2[1]
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vec.add.tail 


Operands: dst, arg1, arg2


Description:


Add another 2 pairs of floating point numbers.
Constraints:
• All operands have domain poly.
• All operands have type float.
• dst has width of 4.
• arg1 has width of 4.
• arg1 has vector size of 2.
• arg2 has width of 4.
• arg2 has vector size of 2.


Notes: Adds another 2 pairs of floating point numbers and puts them in different virtual
accumulate registers. Copies the first virtual accumulate register into dst.


ACC0 = ACC2
ACC1 = ACC3
ACC2 = arg1[0] + arg2[0]
ACC3 = arg1[1] + arg2[1]
dst = ACC0
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vec.add.tail 


Operands: dst, arg1, arg2


Description:


Add another 2 pairs of floating point numbers.
Constraints:
• All operands have domain poly.
• All operands have width of 8.
• All operands have type float.
• arg1 has vector size of 2.
• arg2 has vector size of 2.


Notes: Adds another 2 pairs of floating point numbers and puts them in different virtual
accumulate registers. Copies the first virtual accumulate register into dst.


ACC0 = ACC2
ACC1 = ACC3
ACC2 = arg1[0] + arg2[0]
ACC3 = arg1[1] + arg2[1]
dst = ACC0
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vec.add.end 


Operands: dst1, dst2, dst3


Description:


Extract add/subtract results.
Constraints:
• All operands have domain poly.
• All operands have type float.
• dst1 has width of 4.
• dst2 has width of 4.
• dst3 has width of 4.
Notes: Copies the last 3 virtual accumulate registers into the destination registers.


dst1 = ACC1
dst2 = ACC2
dst3 = ACC3


Also frequently used to end vector subtract.
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vec.add.end 


Operands: dst1, dst2, dst3


Description:


Extract add/subtract results.
Constraints:
• All operands have domain poly.
• All operands have width of 8.
• All operands have type float.
Notes: Copies the last 3 virtual accumulate registers into the destination registers.


dst1 = ACC1
dst2 = ACC2
dst3 = ACC3


Also frequently used to end vector subtract.
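The way vec.add.begin, vec.add.tail and vec.add.end hand results through the accumulate registers can be followed with a short Python model of the transfers listed in the notes. The class and method names are hypothetical, and only the register movement is modelled, not timing.

```python
class VecAddModel:
    """Replays the ACC0..ACC3 transfers of vec.add.begin / tail / end."""

    def __init__(self):
        self.acc = [0.0] * 4

    def begin(self, arg1, arg2):
        self.acc[2] = arg1[0] + arg2[0]
        self.acc[3] = arg1[1] + arg2[1]

    def tail(self, arg1, arg2):
        # The previous pair of sums moves down before the new pair is computed.
        self.acc[0] = self.acc[2]
        self.acc[1] = self.acc[3]
        self.acc[2] = arg1[0] + arg2[0]
        self.acc[3] = arg1[1] + arg2[1]
        return self.acc[0]  # dst

    def end(self):
        return self.acc[1], self.acc[2], self.acc[3]  # dst1, dst2, dst3
```

A begin followed by one tail and one end therefore yields four sums in total: one from tail and three from end.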
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vec.sub.begin 


Operands: arg1, arg2


Description:


Subtract 2 pairs of floating point numbers.
Constraints:
• All operands have domain poly.
• All operands have type float.
• arg1 has width of 4.
• arg1 has vector size of 2.
• arg2 has width of 4.
• arg2 has vector size of 2.


Notes: Subtracts 2 pairs of floating point numbers and puts them in virtual accumulate
registers.


ACC2 = arg1[0] - arg2[0]
ACC3 = arg1[1] - arg2[1]
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vec.sub.begin 


Operands: arg1, arg2


Description:


Subtract 2 pairs of floating point numbers.
Constraints:
• All operands have domain poly.
• All operands have type float.
• All operands have width of 8.
• arg1 has vector size of 2.
• arg2 has vector size of 2.


Notes: Subtracts 2 pairs of floating point numbers and puts them in virtual accumulate
registers.


ACC2 = arg1[0] - arg2[0]
ACC3 = arg1[1] - arg2[1]
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vec.sub.tail 


Operands: dst, arg1, arg2


Description:


Subtracts another 2 pairs of floating point numbers.
Constraints:
• All operands have domain poly.
• All operands have type float.
• dst has width of 4.
• arg1 has width of 4.
• arg1 has vector size of 2.
• arg2 has width of 4.
• arg2 has vector size of 2.


Notes: Subtracts another 2 pairs of floating point numbers and puts them in different virtual
accumulate registers. Copies the first virtual accumulate register into dst.


ACC0 = ACC2
ACC1 = ACC3
ACC2 = arg1[0] - arg2[0]
ACC3 = arg1[1] - arg2[1]
dst = ACC0
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vec.sub.tail 


Operands: dst, arg1, arg2


Description:


Subtracts another 2 pairs of floating point numbers.
Constraints:
• All operands have domain poly.
• All operands have width of 8.
• All operands have type float.
• arg1 has vector size of 2.
• arg2 has vector size of 2.


Notes: Subtracts another 2 pairs of floating point numbers and puts them in different virtual
accumulate registers. Copies the first virtual accumulate register into dst.


ACC0 = ACC2
ACC1 = ACC3
ACC2 = arg1[0] - arg2[0]
ACC3 = arg1[1] - arg2[1]
dst = ACC0




vec.mul.begin 


Operands: arg1, arg2


Description:


Multiplies 2 pairs of floating point numbers.
Constraints:
• All operands have domain poly.
• All operands have type float.
• arg1 has width of 4.
• arg1 has vector size of 2.
• arg2 has width of 4.
• arg2 has vector size of 2.


Notes: Multiplies 2 pairs of floating point numbers and puts them in virtual multiply
registers.


MUL2 = arg1[0] * arg2[0]
MUL3 = arg1[1] * arg2[1]
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CSX600 Instruction Set Instruction set description 


vec.mul.begin 


Operands: arg1, arg2


Description:


Multiplies 2 pairs of floating point numbers.
Constraints:
• All operands have domain poly.
• All operands have type float.
• All operands have width of 8.
• arg1 has vector size of 2.
• arg2 has vector size of 2.


Notes: Multiplies 2 pairs of floating point numbers and puts them in virtual multiply
registers.


MUL2 = arg1[0] * arg2[0]
MUL3 = arg1[1] * arg2[1]
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Instruction set description CSX600 Instruction Set 


vec.mul.tail 


Operands: dst, arg1, arg2


Description:


Multiplies another 2 pairs of floating point numbers.
Constraints:
• All operands have domain poly.
• All operands have type float.
• dst has width of 4.
• arg1 has width of 4.
• arg1 has vector size of 2.
• arg2 has width of 4.
• arg2 has vector size of 2.


Notes: Multiplies another 2 pairs of floating point numbers and puts them in different virtual
multiply registers. Copies the first virtual multiply register into dst.


MUL0 = MUL2
MUL1 = MUL3
MUL2 = arg1[0] * arg2[0]
MUL3 = arg1[1] * arg2[1]
dst = MUL0
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CSX600 Instruction Set 




vec.mul.tail 


Operands: dst, arg1, arg2


Description:


Multiplies another 2 pairs of floating point numbers.
Constraints:
• All operands have domain poly.
• All operands have type float.
• All operands have width of 8.
• arg1 has vector size of 2.
• arg2 has vector size of 2.


Notes: Multiplies another 2 pairs of floating point numbers and puts them in different virtual
multiply registers. Copies the first virtual multiply register into dst.


MUL0 = MUL2
MUL1 = MUL3
MUL2 = arg1[0] * arg2[0]
MUL3 = arg1[1] * arg2[1]
dst = MUL0
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Instruction set description CSX600 Instruction Set 


vec.mul.end 


Operands: dst1, dst2, dst3


Description:


Extract multiply results.
Constraints:
• All operands have domain poly.
• All operands have type float.
• dst1 has width of 4.
• dst2 has width of 4.
• dst3 has width of 4.


Notes: Copies the last 3 virtual multiply registers into the destination registers.


dst1 = MUL1
dst2 = MUL2
dst3 = MUL3


vec.mul.end 


Operands: dst1, dst2, dst3


Description:


Extract multiply results.
Constraints:
• All operands have domain poly.
• All operands have type float.
• All operands have width of 8.
Notes: Copies the last 3 virtual multiply registers into the destination registers.


dst1 = MUL1
dst2 = MUL2
dst3 = MUL3
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Instruction set description CSX600 Instruction Set 


complex.mul


Operands: dst, arg0, arg1


Description:


Multiplies two complex numbers.
Constraints:
• All operands have domain poly.
• All operands have vector size of 2.
• All operands have type float.
• dst has width of 4.
• dst has vector size of 2.
• arg0 has width of 4.
• arg0 has vector size of 2.
• arg1 has width of 4.
• arg1 has vector size of 2.


Notes:
dst[0] = (arg0[0] * arg1[0]) - (arg0[1] * arg1[1]);
dst[1] = (arg0[0] * arg1[1]) + (arg0[1] * arg1[0]);
Details:
dst arg0 arg1 cycles latency temporaries | comments
p4f p4f p4f 11 12 extension microcode
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CSX600 Instruction Set Instruction set description 


complex.mul


Operands: dst, arg0, arg1


Description:


Multiplies two complex numbers.
Constraints:
• All operands have domain poly.
• All operands have vector size of 2.
• All operands have type float.
• All operands have width of 8.
• dst has vector size of 2.
• arg0 has vector size of 2.
• arg1 has vector size of 2.


Notes:
dst[0] = (arg0[0] * arg1[0]) - (arg0[1] * arg1[1]);
dst[1] = (arg0[0] * arg1[1]) + (arg0[1] * arg1[0]);
Details:
dst arg0 arg1 cycles latency temporaries | comments
p8f p8f p8f 11 12 extension microcode
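The two formulas in the notes are the usual real and imaginary parts of a complex product, with element 0 holding the real part and element 1 the imaginary part. A minimal Python sketch (the function name is hypothetical) makes this easy to check against Python's built-in complex type:

```python
def complex_mul(arg0, arg1):
    """Multiply two complex numbers stored as [real, imaginary] pairs."""
    real = (arg0[0] * arg1[0]) - (arg0[1] * arg1[1])
    imag = (arg0[0] * arg1[1]) + (arg0[1] * arg1[0])
    return [real, imag]


# Cross-check against the built-in complex type.
product = complex(1.0, 2.0) * complex(3.0, 4.0)
assert complex_mul([1.0, 2.0], [3.0, 4.0]) == [product.real, product.imag]
```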
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Instruction set description CSX600 Instruction Set 


3.4 Vector instructions 


These instructions expose the vector capabilities of the hardware. Both 32-bit and 64-bit
floating point numbers are supported.


3.4.1 Vector instructions


These instructions perform arithmetic operations on vector types.
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CSX600 Instruction Set Instruction set description 


vector.add (floating point 32-bit) 


Operands: dst, src0, src1


Description:


dst = src0 + src1.
Constraints:
• All operands have type float.
• All operands have width of 4.
• dst has vector size of 4.
• src0 has vector size of 4.
• src1 has vector size of 4.


Notes: This instruction adds each sub-element of the specified vector in src0 to the
corresponding sub-element of the vector src1 and leaves the results in the accumulator,
setting dst as the flush destination. Any other instruction that requires the accumulator to be
reused, or that attempts to use dst as an input, will result in a flush to the destination registers.


Details:

dst src0 src1 cycles latency temporaries | comments
p4f p4f p4f 4 5 microcode
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Instruction set description CSX600 Instruction Set 


vector.add (floating point 64-bit) 


Operands: dst, src0, src1


Description:


dst = src0 + src1.
Constraints:
• All operands have type float.
• All operands have width of 8.
• dst has vector size of 4.
• src0 has vector size of 4.
• src1 has vector size of 4.


Notes: This instruction adds each sub-element of the specified vector in src0 to the
corresponding sub-element of the vector src1 and leaves the results in the accumulator,
setting dst as the flush destination. Any other instruction that requires the accumulator to be
reused, or that attempts to use dst as an input, will result in a flush to the destination registers.


Details:

dst src0 src1 cycles latency temporaries | comments
p8f p8f p8f 4 5 microcode
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CSX600 Instruction Set Instruction set description 


vector.add.scalar (floating point 32-bit) 


Operands: dst, src0, src1


Description:


dst = src0 + src1.
Constraints:
• All operands have type float.
• All operands have width of 4.
• dst has vector size of 4.
• src0 has vector size of 4.


Notes: This instruction adds src1 to each sub-element of the specified vector in src0 and
leaves the results in the accumulator, setting dst as the flush destination. Any other
instruction that requires the accumulator to be reused, or that attempts to use dst as an input,
will result in a flush to the destination registers.


Details:

dst src0 src1 cycles latency temporaries | comments
p4f p4f p4f 4 5 microcode
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Instruction set description CSX600 Instruction Set 


vector.add.scalar (floating point 64-bit) 


Operands: dst, src0, src1


Description:


dst = src0 + src1.
Constraints:
• All operands have type float.
• All operands have width of 8.
• dst has vector size of 4.
• src0 has vector size of 4.


Notes: This instruction adds src1 to each sub-element of the specified vector in src0 and
leaves the results in the accumulator, setting dst as the flush destination. Any other
instruction that requires the accumulator to be reused, or that attempts to use dst as an input,
will result in a flush to the destination registers.


Details:

dst src0 src1 cycles latency temporaries | comments
p8f p8f p8f 4 5 microcode
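The difference between vector.add and vector.add.scalar is only in how src1 is applied: element by element in the first case, as a single value in the second. The Python sketch below shows the arithmetic only. The function names are hypothetical, and the accumulator and flush behaviour described in the notes is deliberately not modelled.

```python
def vector_add(src0, src1):
    """dst[i] = src0[i] + src1[i] for two vectors of size 4."""
    if len(src0) != 4 or len(src1) != 4:
        raise ValueError("both operands must have vector size 4")
    return [a + b for a, b in zip(src0, src1)]


def vector_add_scalar(src0, src1):
    """dst[i] = src0[i] + src1 for a vector of size 4 and a scalar."""
    if len(src0) != 4:
        raise ValueError("src0 must have vector size 4")
    return [a + src1 for a in src0]
```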
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CSX600 Instruction Set Instruction set description 


vector.sub (floating point 32-bit) 


Operands: dst, src0, src1


Description:


dst = src0 - src1.
Constraints:
• All operands have type float.
• All operands have width of 4.
• dst has vector size of 4.
• src0 has vector size of 4.
• src1 has vector size of 4.


Notes: This instruction subtracts each sub-element of the specified vector in src1 from the
corresponding sub-element of the vector src0 and leaves the results in the accumulator,
setting dst as the flush destination. Any other instruction that requires the accumulator to be
reused, or that attempts to use dst as an input, will result in a flush to the destination registers.


Details:

dst src0 src1 cycles latency temporaries | comments
p4f p4f p4f 4 4 microcode
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Instruction set description CSX600 Instruction Set 


vector.sub (floating point 64-bit) 


Operands: dst, src0, src1


Description:


dst = src0 - src1.
Constraints:
• All operands have type float.
• All operands have width of 8.
• dst has vector size of 4.
• src0 has vector size of 4.
• src1 has vector size of 4.


Notes: This instruction subtracts each sub-element of the specified vector in src1 from the
corresponding sub-element of the vector src0 and leaves the results in the accumulator,
setting dst as the flush destination. Any other instruction that requires the accumulator to be
reused, or that attempts to use dst as an input, will result in a flush to the destination registers.


Details:

dst src0 src1 cycles latency temporaries | comments
p8f p8f p8f 4 4 microcode
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CSX600 Instruction Set Instruction set description 


vector.sub.scalar (floating point 32-bit) 


Operands: dst, src0, src1


Description:


dst = src0 - src1.
Constraints:
• All operands have type float.
• dst has vector size of 4.
• dst has width of 4.
• src0 has vector size of 4.
• src0 has width of 4.


Notes: This instruction subtracts src1 from each sub-element of the vector src0 and leaves
the results in the accumulator, setting dst as the flush destination. Any other instruction
that requires the accumulator to be reused, or that attempts to use dst as an input, will result
in a flush to the destination registers.


Details:

dst src0 src1 cycles latency temporaries | comments
p4f p4f p4f 4 4 microcode
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Instruction set description CSX600 Instruction Set 


vector.sub.scalar (floating point 64-bit) 


Operands: dst, src0, src1

Description:

dst = src0 - src1

Constraints:
• All operands have type float.
• All operands have width of 8.
• dst has vector size of 4.
• src0 has vector size of 4.

Notes: This instruction subtracts src1 from each sub-element of the vector src0 and leaves
the results in the accumulator, setting dst as the flush destination. Any other instruction
that requires the accumulator to be reused, or that attempts to use dst as an input, will
result in a flush to the destination registers.

Details:

dst   src0  src1  cycles  latency  temporaries  comments
p8f   p8f   p8f   4       4                     microcode


vector.displace (floating point 32-bit) 


Operands: dst, src0, src1

Description:

dst = src0 - src1

Constraints:
• All operands have type float.
• All operands have width of 4.
• dst has vector size of 4.
• dst has width of 4.
• src1 has vector size of 4.
• src1 has width of 4.

Notes: This instruction subtracts each sub-element of the vector src1 from src0 and leaves
the results in the accumulator, setting dst as the flush destination. Any other instruction
that requires the accumulator to be reused, or that attempts to use dst as an input, will
result in a flush to the destination registers.

Details:

dst src0 src1 cycles latency temporaries comments
p4f p4f p4f 4 microcode


vector.displace (floating point 64-bit) 


Operands: dst, src0, src1

Description:

dst = src0 - src1

Constraints:
• All operands have type float.
• All operands have width of 8.
• dst has vector size of 4.
• src1 has vector size of 4.

Notes: This instruction subtracts each sub-element of the vector src1 from src0 and leaves
the results in the accumulator, setting dst as the flush destination. Any other instruction
that requires the accumulator to be reused, or that attempts to use dst as an input, will
result in a flush to the destination registers.

Details:

dst   src0  src1  cycles  latency  temporaries  comments
p8f   p8f   p8f   4       4                     microcode
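
The three subtract forms differ only in which operand is a scalar. The following Python sketch is a reference model of the arithmetic alone, under the assumption that a 4-element vector can be represented as a list of four floats; the function names are hypothetical, and the accumulator and flush behaviour is not modelled.

```python
def vector_sub(src0, src1):
    """vector.sub: dst[i] = src0[i] - src1[i] for 4-element vectors."""
    assert len(src0) == len(src1) == 4
    return [a - b for a, b in zip(src0, src1)]


def vector_sub_scalar(src0, src1):
    """vector.sub.scalar: dst[i] = src0[i] - src1, where src1 is a scalar."""
    assert len(src0) == 4
    return [a - src1 for a in src0]


def vector_displace(src0, src1):
    """vector.displace: dst[i] = src0 - src1[i], where src0 is a scalar."""
    assert len(src1) == 4
    return [src0 - b for b in src1]
```

Note that vector.sub.scalar and vector.displace are not interchangeable: the scalar is the subtrahend in the first and the minuend in the second.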


vector.mul (floating point 32-bit) 


Operands: dst, src0, src1

Description:

dst = src0 * src1

Constraints:
• All operands have type float.
• All operands have width of 4.
• dst has vector size of 4.
• src0 has vector size of 4.
• src1 has vector size of 4.

Notes: This instruction multiplies each sub-element of src0 by the corresponding sub-element
of src1 and leaves the results in the multiplier, setting dst as the flush destination. Any
other instruction that requires the multiplier to be reused, or that attempts to use dst as
an input, will result in a flush to the destination registers.

Details:

dst   src0  src1  cycles  latency  temporaries  comments
p4f   p4f   p4f   4       5                     microcode


vector.mul (floating point 64-bit) 


Operands: dst, src0, src1

Description:

dst = src0 * src1

Constraints:
• All operands have type float.
• All operands have width of 8.
• dst has vector size of 4.
• src0 has vector size of 4.
• src1 has vector size of 4.

Notes: This instruction multiplies each sub-element of src0 by the corresponding sub-element
of src1 and leaves the results in the multiplier, setting dst as the flush destination. Any
other instruction that requires the multiplier to be reused, or that attempts to use dst as
an input, will result in a flush to the destination registers.

Details:

dst   src0  src1  cycles  latency  temporaries  comments
p8f   p8f   p8f   4       5                     microcode

Document No. 06-RM-1137 Revision: 3.A


ClearSpeed Technology plc 




vector.mul.scalar (floating point 32-bit) 


Operands: dst, src0, src1

Description:

dst = src0 * src1

Constraints:
• All operands have type float.
• All operands have width of 4.
• dst has vector size of 4.
• dst has width of 4.
• src0 has vector size of 4.
• src0 has width of 4.

Notes: This instruction multiplies each sub-element of src0 by src1 and leaves the results
in the multiplier, setting dst as the flush destination. Any other instruction that requires
the multiplier to be reused, or that attempts to use dst as an input, will result in a flush
to the destination registers.

Details:

dst   src0  src1  cycles  latency  temporaries  comments
p4f   p4f   p4f   4       5                     microcode


vector.mul.scalar (floating point 64-bit) 


Operands: dst, src0, src1

Description:

dst = src0 * src1

Constraints:
• All operands have type float.
• All operands have width of 8.
• dst has vector size of 4.
• src0 has vector size of 4.

Notes: This instruction multiplies each sub-element of src0 by src1 and leaves the results
in the multiplier, setting dst as the flush destination. Any other instruction that requires
the multiplier to be reused, or that attempts to use dst as an input, will result in a flush
to the destination registers.

Details:

dst   src0  src1  cycles  latency  temporaries  comments
p8f   p8f   p8f   4       5                     microcode
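
As a reference for the arithmetic of the two multiply forms, the following Python sketch may help. It assumes a 4-element vector is held as a list of four floats; the function names are hypothetical, and the multiplier pipeline and its flush behaviour are not modelled.

```python
def vector_mul(src0, src1):
    """vector.mul: dst[i] = src0[i] * src1[i] for 4-element vectors."""
    assert len(src0) == len(src1) == 4
    return [a * b for a, b in zip(src0, src1)]


def vector_mul_scalar(src0, src1):
    """vector.mul.scalar: dst[i] = src0[i] * src1, where src1 is a scalar."""
    assert len(src0) == 4
    return [a * src1 for a in src0]
```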


vector.mulacc (floating point 32-bit) 


Operands: dst, src0, src1

Description:

dst += src0 * src1

Constraints:
• All operands have type float.
• All operands have width of 4.
• dst has vector size of 4.
• src0 has vector size of 4.
• src1 has vector size of 4.

Notes: This instruction multiplies each sub-element of src0 by the corresponding sub-element
of src1. The result of this multiply is then accumulated with dst and left in the accumulator.
A subsequent call to mulacc with the same dst will result in just a multiply of the two inputs,
with further calls doing an accumulate and multiply. Any other instruction that uses dst as an
input, or that uses the vector unit with a different dst, will result in a flush of the vector
unit.

Details:

dst   src0  src1  cycles  latency  temporaries  comments
p4f   p4f   p4f                                 macro


vector.mulacc (floating point 64-bit) 


Operands: dst, src0, src1

Description:

dst += src0 * src1

Constraints:
• All operands have type float.
• All operands have width of 8.
• dst has vector size of 4.
• src0 has vector size of 4.
• src1 has vector size of 4.

Notes: This instruction multiplies each sub-element of src0 by the corresponding sub-element
of src1. The result of this multiply is then accumulated with dst and left in the accumulator.
A subsequent call to mulacc with the same dst will result in just a multiply of the two inputs,
with further calls doing an accumulate and multiply. Any other instruction that uses dst as an
input, or that uses the vector unit with a different dst, will result in a flush of the vector
unit.

Details:

dst   src0  src1  cycles  latency  temporaries  comments
p8f   p8f   p8f   8       8                     macro


vector.mulacc.scalar (floating point 32-bit) 


Operands: dst, src0, src1

Description:

dst += src0 * src1

Constraints:
• All operands have type float.
• All operands have width of 4.
• dst has vector size of 4.
• src0 has vector size of 4.

Notes: This instruction multiplies each sub-element of src0 by src1. The result of this
multiply is then accumulated with dst and left in the accumulator. A subsequent call to mulacc
with the same dst will result in just a multiply of the two inputs, with further calls doing an
accumulate and multiply. Any other instruction that uses dst as an input, or that uses the
vector unit with a different dst, will result in a flush of the vector unit.

Details:

dst   src0  src1  cycles  latency  temporaries  comments
p4f   p4f   p4f   8       8                     macro


vector.mulacc.scalar (floating point 64-bit) 


Operands: dst, src0, src1

Description:

dst += src0 * src1

Constraints:
• All operands have type float.
• All operands have width of 8.
• dst has vector size of 4.
• src0 has vector size of 4.

Notes: This instruction multiplies each sub-element of src0 by src1. The result of this
multiply is then accumulated with dst and left in the accumulator. A subsequent call to mulacc
with the same dst will result in just a multiply of the two inputs, with further calls doing an
accumulate and multiply. Any other instruction that uses dst as an input, or that uses the
vector unit with a different dst, will result in a flush of the vector unit.

Details:

dst   src0  src1  cycles  latency  temporaries  comments
p8f   p8f   p8f   8       8                     macro
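
The net arithmetic effect of the multiply-accumulate forms, dst += src0 * src1, can be sketched in Python as follows. This is a reference model of the Description formula only: the function names are hypothetical, vectors are assumed to be lists of four floats, and the way the hardware keeps partial results in the accumulator between calls is not modelled.

```python
def vector_mulacc(dst, src0, src1):
    """vector.mulacc: returns dst[i] + src0[i] * src1[i]."""
    assert len(dst) == len(src0) == len(src1) == 4
    return [d + a * b for d, a, b in zip(dst, src0, src1)]


def vector_mulacc_scalar(dst, src0, src1):
    """vector.mulacc.scalar: returns dst[i] + src0[i] * src1 (scalar src1)."""
    assert len(dst) == len(src0) == 4
    return [d + a * src1 for d, a in zip(dst, src0)]


def accumulate_products(pairs):
    """Apply vector_mulacc repeatedly to one dst, starting from zero."""
    dst = [0.0, 0.0, 0.0, 0.0]
    for src0, src1 in pairs:
        dst = vector_mulacc(dst, src0, src1)
    return dst
```

The accumulate_products helper mirrors the typical use of the instruction: a run of mulacc operations that all name the same dst.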


vector.mulnegacc (floating point 32-bit) 


Operands: dst, src0, src1

Description:

dst -= src0 * src1

Constraints:
• All operands have type float.
• All operands have width of 4.
• dst has vector size of 4.
• src0 has vector size of 4.
• src1 has vector size of 4.

Notes: This instruction multiplies each sub-element of src0 by the corresponding sub-element
of src1. The result of this multiply is then subtracted from dst and left in the accumulator.
A subsequent call to mulnegacc with the same dst will result in just a multiply of the two
inputs, with further calls doing a subtract and multiply. Any other instruction that uses dst
as an input, or that uses the vector unit with a different dst, will result in a flush of the
vector unit.

Details:

dst   src0  src1  cycles  latency  temporaries  comments
p4f   p4f   p4f   32      33       4 poly       macro


vector.mulnegacc (floating point 64-bit) 


Operands: dst, src0, src1

Description:

dst -= src0 * src1

Constraints:
• All operands have type float.
• All operands have width of 8.
• dst has vector size of 4.
• src0 has vector size of 4.
• src1 has vector size of 4.

Notes: This instruction multiplies each sub-element of src0 by the corresponding sub-element
of src1. The result of this multiply is then subtracted from dst and left in the accumulator.
A subsequent call to mulnegacc with the same dst will result in just a multiply of the two
inputs, with further calls doing a subtract and multiply. Any other instruction that uses dst
as an input, or that uses the vector unit with a different dst, will result in a flush of the
vector unit.

Details:

dst   src0  src1  cycles  latency  temporaries  comments
p8f   p8f   p8f   32      33       8 poly       macro


vector.mulnegacc.scalar (floating point 32-bit) 


Operands: dst, src0, src1

Description:

dst -= src0 * src1

Constraints:
• All operands have type float.
• All operands have width of 4.
• dst has vector size of 4.
• src0 has vector size of 4.

Notes: This instruction multiplies each sub-element of src0 by src1. The result of this
multiply is then subtracted from dst and left in the accumulator. A subsequent call to
mulnegacc with the same dst will result in just a multiply of the two inputs, with further
calls doing a subtract and multiply. Any other instruction that uses dst as an input, or that
uses the vector unit with a different dst, will result in a flush of the vector unit.

Details:

dst   src0  src1  cycles  latency  temporaries  comments
p4f   p4f   p4f   32      33       4 poly       macro


vector.mulnegacc.scalar (floating point 64-bit) 


Operands: dst, src0, src1

Description:

dst -= src0 * src1

Constraints:
• All operands have type float.
• All operands have width of 8.
• dst has vector size of 4.
• src0 has vector size of 4.

Notes: This instruction multiplies each sub-element of src0 by src1. The result of this
multiply is then subtracted from dst and left in the accumulator. A subsequent call to
mulnegacc with the same dst will result in just a multiply of the two inputs, with further
calls doing a subtract and multiply. Any other instruction that uses dst as an input, or that
uses the vector unit with a different dst, will result in a flush of the vector unit.

Details:

dst   src0  src1  cycles  latency  temporaries  comments
p8f   p8f   p8f   32      33       8 poly       macro
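
The net arithmetic of the negated multiply-accumulate forms, dst -= src0 * src1, can be sketched in Python as below. The function names are hypothetical, vectors are assumed to be lists of four floats, and only the Description formula is modelled, not the accumulator or flush behaviour.

```python
def vector_mulnegacc(dst, src0, src1):
    """vector.mulnegacc: returns dst[i] - src0[i] * src1[i]."""
    assert len(dst) == len(src0) == len(src1) == 4
    return [d - a * b for d, a, b in zip(dst, src0, src1)]


def vector_mulnegacc_scalar(dst, src0, src1):
    """vector.mulnegacc.scalar: returns dst[i] - src0[i] * src1 (scalar src1)."""
    assert len(dst) == len(src0) == 4
    return [d - a * src1 for d, a in zip(dst, src0)]
```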


vector.addmul (floating point 32-bit) 


Operands: dst, src0, src1

Description:

dst *= src0 + src1

Constraints:
• All operands have type float.
• All operands have width of 4.
• dst has vector size of 4.
• src0 has vector size of 4.
• src1 has vector size of 4.

Notes: This instruction adds each sub-element of src0 to the corresponding sub-element of
src1, multiplies dst by the results, and writes the product back to dst.

Details:

dst   src0  src1  cycles  latency  temporaries  comments
p4f   p4f   p4f   12      13                    macro


vector.addmul (floating point 64-bit) 


Operands: dst, src0, src1

Description:

dst *= src0 + src1

Constraints:
• All operands have type float.
• All operands have width of 8.
• dst has vector size of 4.
• src0 has vector size of 4.
• src1 has vector size of 4.

Notes: This instruction adds each sub-element of src0 to the corresponding sub-element of
src1, multiplies dst by the results, and writes the product back to dst.

Details:

dst   src0  src1  cycles  latency  temporaries  comments
p8f   p8f   p8f   12      13                    macro


vector.addmul.scalar (floating point 32-bit) 


Operands: dst, src0, src1

Description:

dst *= src0 + src1

Constraints:
• All operands have type float.
• All operands have width of 4.
• dst has vector size of 4.
• src0 has vector size of 4.

Notes: This instruction adds src1 to each sub-element of src0, multiplies dst by the
results, and writes the product back to dst.

Details:

dst   src0  src1  cycles  latency  temporaries  comments
p4f   p4f   p4f   12      13                    macro


vector.addmul.scalar (floating point 64-bit) 


Operands: dst, src0, src1

Description:

dst *= src0 + src1

Constraints:
• All operands have type float.
• All operands have width of 8.
• dst has vector size of 4.
• src0 has vector size of 4.

Notes: This instruction adds src1 to each sub-element of src0, multiplies dst by the
results, and writes the product back to dst.

Details:

dst   src0  src1  cycles  latency  temporaries  comments
p8f   p8f   p8f   12      13                    macro
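
The add-then-multiply forms, dst *= src0 + src1, can be written as the following Python reference model. The function names are hypothetical and vectors are assumed to be lists of four floats; the sketch follows the Description formula and says nothing about how the macro is expanded.

```python
def vector_addmul(dst, src0, src1):
    """vector.addmul: returns dst[i] * (src0[i] + src1[i])."""
    assert len(dst) == len(src0) == len(src1) == 4
    return [d * (a + b) for d, a, b in zip(dst, src0, src1)]


def vector_addmul_scalar(dst, src0, src1):
    """vector.addmul.scalar: returns dst[i] * (src0[i] + src1), scalar src1."""
    assert len(dst) == len(src0) == 4
    return [d * (a + src1) for d, a in zip(dst, src0)]
```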


vector.submul (floating point 32-bit) 


Operands: dst, src0, src1

Description:

dst *= src0 - src1

Constraints:
• All operands have type float.
• All operands have width of 4.
• dst has vector size of 4.
• src0 has vector size of 4.
• src1 has vector size of 4.

Notes: This instruction subtracts each sub-element of src1 from the corresponding sub-element
of src0, multiplies dst by the results, and writes the product back to dst.

Details:

dst   src0  src1  cycles  latency  temporaries  comments
p4f   p4f   p4f   16      17       16 poly      macro


vector.submul (floating point 64-bit) 


Operands: dst, src0, src1

Description:

dst *= src0 - src1

Constraints:
• All operands have type float.
• All operands have width of 8.
• dst has vector size of 4.
• src0 has vector size of 4.
• src1 has vector size of 4.

Notes: This instruction subtracts each sub-element of src1 from the corresponding sub-element
of src0, multiplies dst by the results, and writes the product back to dst.

Details:

dst   src0  src1  cycles  latency  temporaries  comments
p8f   p8f   p8f   16      17       32 poly      macro


vector.submul.scalar (floating point 32-bit) 


Operands: dst, src0, src1

Description:

dst *= src0 - src1

Constraints:
• All operands have type float.
• All operands have width of 4.
• dst has vector size of 4.
• src0 has vector size of 4.

Notes: This instruction subtracts src1 from each sub-element of src0, multiplies dst by the
results, and writes the product back to dst.

Details:

dst   src0  src1  cycles  latency  temporaries  comments
p4f   p4f   p4f   16      17       16 poly      macro


vector.submul.scalar (floating point 64-bit) 


Operands: dst, src0, src1

Description:

dst *= src0 - src1

Constraints:
• All operands have type float.
• All operands have width of 8.
• dst has vector size of 4.
• src0 has vector size of 4.

Notes: This instruction subtracts src1 from each sub-element of src0, multiplies dst by the
results, and writes the product back to dst.

Details:

dst   src0  src1  cycles  latency  temporaries  comments
p8f   p8f   p8f   16      17       32 poly      macro
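
The subtract-then-multiply forms, dst *= src0 - src1, can be modelled in Python as shown below. The function names are hypothetical and vectors are assumed to be lists of four floats. The operand order follows the Description formula, with src1 subtracted from src0.

```python
def vector_submul(dst, src0, src1):
    """vector.submul: returns dst[i] * (src0[i] - src1[i])."""
    assert len(dst) == len(src0) == len(src1) == 4
    return [d * (a - b) for d, a, b in zip(dst, src0, src1)]


def vector_submul_scalar(dst, src0, src1):
    """vector.submul.scalar: returns dst[i] * (src0[i] - src1), scalar src1."""
    assert len(dst) == len(src0) == 4
    return [d * (a - src1) for d, a in zip(dst, src0)]
```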


vector.neg (floating point 32-bit) 


Operands: dst, src 


Description: 


dst = -src

Constraints:
• All operands have type float.
• All operands have width of 4.
• dst has vector size of 4.
• src has vector size of 4.

Notes: This instruction negates each sub-element of src and stores the results in dst.

Details:

dst   src   cycles  latency  temporaries  comments
p4f   p4f   7       8                     microcode


vector.neg (floating point 64-bit) 


Operands: dst, src 


Description: 


dst = -src

Constraints:
• All operands have type float.
• All operands have width of 8.
• dst has vector size of 4.
• src has vector size of 4.

Notes: This instruction negates each sub-element of src and stores the results in dst.

Details:

dst   src   cycles  latency  temporaries  comments
p8f   p8f   7       8                     microcode
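
A one-line Python reference model of the negate operation is given below. The function name is hypothetical and the vector is assumed to be a list of four floats; the sketch treats the operation as arithmetic negation of each sub-element, as the Notes describe.

```python
def vector_neg(src):
    """vector.neg: returns the arithmetic negation of each of the 4 sub-elements."""
    assert len(src) == 4
    return [-a for a in src]
```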


vector.cast (vector 32-bit integer to vector 32-bit float) 


Operands: dst, src 


Description: 


dst = src 
Constraints:
• All operands have domain poly.
• dst has vector size of 4.
• dst has type float.
• dst has width of 4.
• src has vector size of 4.
• src has type integer.
• src has width of 4.

Notes: This instruction casts each sub-element of src to the type specified for dst and
stores the result in dst.

Details:

dst   src   cycles  latency  temporaries  comments
p4f   p4us  8       9                     macro


vector.cast (vector 16-bit integer to vector 32-bit float) 


Operands: dst, src 


Description: 


dst = src 
Constraints:
• All operands have domain poly.
• dst has vector size of 4.
• dst has type float.
• dst has width of 4.
• src has vector size of 4.
• src has type integer.
• src has width of 2.

Notes: This instruction casts each sub-element of src to the type specified for dst and
stores the result in dst.

Details:

dst   src   cycles  latency  temporaries  comments
p4f   p2us  8       9                     macro


vector.cast (vector 32-bit float to vector 32-bit int) 


Operands: dst, src 


Description: 


dst = src 
Constraints:
• All operands have domain poly.
• dst has vector size of 4.
• dst has type integer.
• dst has width of 4.
• src has vector size of 4.
• src has type float.
• src has width of 4.

Notes: This instruction casts each sub-element of src to the type specified for dst and
stores the result in dst.

Details:

dst   src   cycles  latency  temporaries  comments
p4us  p4f   8       9                     macro


vector.cast (vector 32-bit float to vector 16-bit int) 


Operands: dst, src 


Description: 


dst = src 
Constraints:
• All operands have domain poly.
• dst has vector size of 4.
• dst has type integer.
• dst has width of 2.
• src has vector size of 4.
• src has type float.
• src has width of 4.

Notes: This instruction casts each sub-element of src to the type specified for dst and
stores the result in dst.

Details:

dst   src   cycles  latency  temporaries  comments
p2us  p4f   8       9                     macro


vector.cast (vector 32-bit integer to vector 64-bit float) 


Operands: dst, src 


Description: 


dst = src 
Constraints:
• All operands have domain poly.
• dst has vector size of 4.
• dst has type float.
• dst has width of 8.
• src has vector size of 4.
• src has type integer.
• src has width of 4.

Notes: This instruction casts each sub-element of src to the type specified for dst and
stores the result in dst.

Details:

dst   src   cycles  latency  temporaries  comments
p8f   p4us  8       9                     macro


vector.cast (vector 16-bit integer to vector 64-bit float) 


Operands: dst, src 


Description: 


dst = src 
Constraints:
• All operands have domain poly.
• dst has vector size of 4.
• dst has type float.
• dst has width of 8.
• src has vector size of 4.
• src has type integer.
• src has width of 2.

Notes: This instruction casts each sub-element of src to the type specified for dst and
stores the result in dst.

Details:

dst   src   cycles  latency  temporaries  comments
p8f   p2us  8       9                     macro


vector.cast (vector 64-bit float to vector 32-bit int) 


Operands: dst, src 


Description: 


dst = src 
Constraints:
• All operands have domain poly.
• dst has vector size of 4.
• dst has type integer.
• dst has width of 4.
• src has vector size of 4.
• src has type float.
• src has width of 8.

Notes: This instruction casts each sub-element of src to the type specified for dst and
stores the result in dst.

Details:

dst   src   cycles  latency  temporaries  comments
p4us  p8f   8       9                     macro


vector.cast (vector 64-bit float to vector 16-bit int) 


Operands: dst, src 


Description: 


dst = src 
Constraints:
• All operands have domain poly.
• dst has vector size of 4.
• dst has type integer.
• dst has width of 2.
• src has vector size of 4.
• src has width of 8.

Notes: This instruction casts each sub-element of src to the type specified for dst and
stores the result in dst.

Details:

dst   src   cycles  latency  temporaries  comments
p2us  p8f   8       9                     macro


vector.cast (vector 32-bit float to vector 64-bit float) 


Operands: dst, src 


Description: 


dst = src 
Constraints:
• All operands have domain poly.
• dst has vector size of 4.
• dst has type float.
• dst has width of 8.
• src has vector size of 4.
• src has type float.
• src has width of 4.

Notes: This instruction casts each sub-element of src to the type specified for dst and
stores the result in dst.

Details:

dst   src   cycles  latency  temporaries  comments
p8f   p4f   8       9                     macro


vector.cast (vector 64-bit float to vector 32-bit float) 


Operands: dst, src 


Description: 


dst = src 
Constraints:
• All operands have domain poly.
• dst has vector size of 4.
• dst has type float.
• dst has width of 4.
• src has vector size of 4.
• src has type float.
• src has width of 8.

Notes: This instruction casts each sub-element of src to the type specified for dst and
stores the result in dst.

Details:

dst   src   cycles  latency  temporaries  comments
p4f   p8f   8       9                     macro
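
The effect of the cast variants on values can be explored with the Python sketch below. It is an illustration under stated assumptions, not a description of the CSX600 behaviour: the function names are hypothetical, 32-bit floats are emulated by round-tripping through the standard struct module, and float-to-integer conversion is assumed to truncate toward zero. The source text does not specify the rounding mode, the signedness, or the handling of out-of-range values, so none of those are modelled.

```python
import struct


def to_f32(x):
    """Round a Python float (64-bit) to the nearest 32-bit float value."""
    return struct.unpack("<f", struct.pack("<f", float(x)))[0]


def cast_int_to_f32(src):
    """Integer vector to 32-bit float vector."""
    return [to_f32(v) for v in src]


def cast_int_to_f64(src):
    """Integer vector to 64-bit float vector."""
    return [float(v) for v in src]


def cast_f64_to_f32(src):
    """64-bit float vector to 32-bit float vector (may lose precision)."""
    return [to_f32(v) for v in src]


def cast_float_to_int(src):
    """Float vector to integer vector; truncation toward zero is an assumption."""
    return [int(v) for v in src]
```

The model makes one point visible: a 32-bit integer does not always survive a cast to a 32-bit float, because the float has fewer significand bits than the integer has value bits.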


vector.reduce.add (32-bit floating point add reduction) 


Operands: dst, src 


Description: 


dst = src[0] + src[1] + src[2] + src[3]

Constraints:
• All operands have domain poly.
• dst has type float.
• dst has width of 4.
• src has vector size of 4.
• src has type float.
• src has width of 4.

Notes: Adds the four elements in the specified vector together and assigns the result to the
destination.

Details:

dst   src   cycles  latency  temporaries  comments
p4f   p4f   9       10                    microcode


vector.reduce.add (64-bit floating point add reduction) 


Operands: dst, src 


Description: 


dst = src[0] + src[1] + src[2] + src[3]

Constraints:
• All operands have domain poly.
• dst has type float.
• dst has width of 8.
• src has vector size of 4.
• src has type float.
• src has width of 8.

Notes: Adds the four elements in the specified vector together and assigns the result to the
destination.

Details:

dst   src   cycles  latency  temporaries  comments
p8f   p8f   9       10                    microcode
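
The add reduction collapses a 4-element vector to a single value. A minimal Python sketch follows; the function name is hypothetical, and the left-to-right evaluation order is an assumption taken from the way the Description formula is written. The order in which the hardware combines the elements is not stated here, and floating point results can depend on it.

```python
def vector_reduce_add(src):
    """vector.reduce.add: returns src[0] + src[1] + src[2] + src[3]."""
    assert len(src) == 4
    # Evaluated left to right, as the formula is written (an assumption).
    return src[0] + src[1] + src[2] + src[3]
```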


vector.reduce.mul (32-bit floating point multiply reduction) 


Operands: dst, src1, src2

Description:

dst = src[0] * src[1] * src[2] * src[3]

Constraints:
• All operands have domain poly.
• dst has type float.
• dst has width of 4.
• src1 has vector size of 4.
• src1 has type float.
• src1 has width of 4.
• src2 has vector size of 4.
• src2 has type float.
• src2 has width of 4.

Notes: Multiplies the four elements in the specified vector together and assigns the result to
the destination.

Details:

dst   src1  src2  cycles  latency  temporaries  comments
p4f   p4f   p4f   15      16                    macro


vector.reduce.mul (64-bit floating point multiply reduction) 


Operands: dst, src1, src2

Description:

dst = src[0] * src[1] * src[2] * src[3]

Constraints:
• All operands have domain poly.
• dst has type float.
• dst has width of 8.
• src1 has vector size of 4.
• src1 has type float.
• src1 has width of 8.
• src2 has vector size of 4.
• src2 has type float.
• src2 has width of 8.

Notes: Multiplies the four elements in the specified vector together and assigns the result to
the destination.

Details:

dst   src1  src2  cycles  latency  temporaries  comments
p8f   p8f   p8f   15      16                    macro
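
The multiply reduction can be modelled in Python as below. The sketch follows the Description formula, which refers to a single vector src, and does not attempt to interpret the second source operand listed under Operands. The function name is hypothetical and the left-to-right evaluation order is an assumption.

```python
def vector_reduce_mul(src):
    """vector.reduce.mul: returns src[0] * src[1] * src[2] * src[3]."""
    assert len(src) == 4
    # Evaluated left to right, as the formula is written (an assumption).
    return src[0] * src[1] * src[2] * src[3]
```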


_vec.mul (32-bit floating point multiply operation) 


Operands: src1, src2

Description:

fpmul = src1 * src2

Constraints:
• All operands have domain poly.
• src1 has vector size of 4.
• src1 has type float.
• src1 has width of 4.
• src2 has vector size of 4.
• src2 has type float.
• src2 has width of 4.

Notes: Multiplies the corresponding elements of the two specified vectors, leaving the four
results in the floating point multiply pipeline for either recirculating or flushing to the
register file.

Details:

src1  src2  cycles  latency  temporaries  comments
p4f   p4f   4       4                     microcode


_vec.mul (64-bit floating point multiply operation) 


Operands: src1, src2

Description:

fpmul = src1 * src2

Constraints:
• All operands have domain poly.
• src1 has vector size of 4.
• src1 has type float.
• src1 has width of 8.
• src2 has vector size of 4.
• src2 has type float.
• src2 has width of 8.

Notes: Multiplies the corresponding elements of the two specified vectors, leaving the four
results in the floating point multiply pipeline for either recirculating or flushing to the
register file.

Details:

src1  src2  cycles  latency  temporaries  comments
p8f   p8f   4       4                     microcode


_vec.add (32-bit floating point add operation) 


Operands: src1, src2

Description:

fpadd = src1 + src2

Constraints:
• All operands have domain poly.
• src1 has vector size of 4.
• src1 has type float.
• src1 has width of 4.
• src2 has vector size of 4.
• src2 has type float.
• src2 has width of 4.

Notes: Adds the corresponding elements of the two specified vectors, leaving the four results
in the floating point add pipeline for either recirculating or flushing to the register file.

Details:

src1  src2  cycles  latency  temporaries  comments
p4f   p4f   4       4                     microcode


_vec.add (64-bit floating point add operation) 


Operands: src1, src2

Description:

fpadd = src1 + src2

Constraints:
• All operands have domain poly.
• src1 has vector size of 4.
• src1 has type float.
• src1 has width of 8.
• src2 has vector size of 4.
• src2 has type float.
• src2 has width of 8.

Notes: Adds the corresponding elements of the two specified vectors, leaving the four results
in the floating point add pipeline for either recirculating or flushing to the register file.

Details:

src1  src2  cycles  latency  temporaries  comments
p8f   p8f   4       4                     microcode


_vec.sub (32-bit floating point subtract operation) 


Operands: src1, src2

Description:

fpadd = src1 - src2

Constraints:
• All operands have domain poly.
• src1 has vector size of 4.
• src1 has type float.
• src1 has width of 4.
• src2 has vector size of 4.
• src2 has type float.
• src2 has width of 4.

Notes: Subtracts the four elements of src2 from the corresponding elements of src1, leaving
the results in the floating point add pipeline for either recirculating or flushing to the
register file.

Details:

src1  src2  cycles  latency  temporaries  comments
p4f   p4f   4       4                     microcode


_vec.sub (64-bit floating point subtract operation) 


Operands: src1, src2

Description:

fpadd = src1 - src2

Constraints:
• All operands have domain poly.
• src1 has vector size of 4.
• src1 has type float.
• src1 has width of 8.
• src2 has vector size of 4.
• src2 has type float.
• src2 has width of 8.

Notes: Subtracts the four elements of src2 from the corresponding elements of src1, leaving
the results in the floating point add pipeline for either recirculating or flushing to the
register file.

Details:

src1  src2  cycles  latency  temporaries  comments
p8f   p8f   4       4                     microcode


_vec.acc (32-bit floating point accumulate operation) 


Operands: src 


Description: 


fpadd = src + fpmul 
Constraints:
• All operands have domain poly.
• src has vector size of 4.
• src has type float.
• src has width of 4.

Notes: Adds the four elements of src to the result of the multiply, stepping the multiply
pipe each cycle, and leaves the results in the floating point add pipeline for either
recirculating or flushing to the register file.

Details:

src   cycles  latency  temporaries  comments
p4f   4       4                     microcode
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Instruction set description 


CSX600 Instruction Set 


_vec.acc (64-bit floating point accumulate operation)

Operands: src

Description:

fpadd = src + fpmul

Constraints:

• All operands have domain poly.
• src has vector size of 4.
• src has type float.
• src has width of 8.

Notes: Adds the four elements in src to the result of the multiply, stepping the multiply
pipe each cycle, leaving the results in the floating point add pipeline for either recirculating
or flushing to the register file.

Details:

src   cycles  latency  temporaries  comments
p8f   4       4        -            microcode
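
The data flow of the accumulate step can be sketched in a few lines of Python. This is only a behavioural model under stated assumptions: the two pipelines are represented as four-element lists named fpadd and fpmul after the description above, the function name vec_acc is hypothetical, and cycle timing is not modelled.

```python
def vec_acc(src, fpmul):
    """Model of fpadd = src + fpmul for one 4-element poly vector."""
    if len(src) != 4 or len(fpmul) != 4:
        raise ValueError("both operands must have vector size 4")
    # Each element of src is added to the matching multiply-pipeline result.
    return [s + m for s, m in zip(src, fpmul)]


# Assumed contents of the multiply pipeline left by an earlier multiply.
fpmul = [1.0, 2.0, 3.0, 4.0]
fpadd = vec_acc([0.5, 0.5, 0.5, 0.5], fpmul)
```

In this model the result stays in fpadd; a separate flush is still needed before the values are visible in a register.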
_vec.aflush (32-bit floating point adder flush function)

Operands: dst

Description:

dst = fpadd

Constraints:

• All operands have domain poly.
• dst has vector size of 4.
• dst has type float.
• dst has width of 4.

Notes: Flushes the values held in the floating point adder pipeline out to the register
location indicated in dst.

Details:

dst   cycles  latency  temporaries  comments
p4f   4       5        -            microcode


_vec.aflush (64-bit floating point adder flush function)

Operands: dst

Description:

dst = fpadd

Constraints:

• All operands have domain poly.
• dst has vector size of 4.
• dst has type float.
• dst has width of 8.

Notes: Flushes the values held in the floating point adder pipeline out to the register
location indicated in dst.

Details:

dst   cycles  latency  temporaries  comments
p8f   4       5        -            microcode


_vec.mflush (32-bit floating point multiply flush function)

Operands: dst

Description:

dst = fpmul

Constraints:

• All operands have domain poly.
• dst has vector size of 4.
• dst has type float.
• dst has width of 4.

Notes: Flushes the values held in the floating point multiply pipeline out to the register
location indicated in dst.

Details:

dst   cycles  latency  temporaries  comments
p4f   4       5        -            microcode


_vec.mflush (64-bit floating point multiply flush function)

Operands: dst

Description:

dst = fpmul

Constraints:

• All operands have domain poly.
• dst has vector size of 4.
• dst has type float.
• dst has width of 8.

Notes: Flushes the values held in the floating point multiply pipeline out to the register
location indicated in dst.

Details:

dst   cycles  latency  temporaries  comments
p8f   4       5        -            microcode


_vec.maflush (32-bit floating point multiply and accumulate flush function)

Operands: dst

Description:

dst = fpadd + fpmul

Constraints:

• All operands have domain poly.
• dst has vector size of 4.
• dst has type float.
• dst has width of 4.

Notes: Accumulates the floating point multiply into the adder pipeline, then flushes the
adder pipeline to dst.

Details:

dst   cycles  latency  temporaries  comments
p4f   8       9        -            microcode


_vec.maflush (64-bit floating point multiply and accumulate flush function)

Operands: dst

Description:

dst = fpadd + fpmul

Constraints:

• All operands have domain poly.
• dst has vector size of 4.
• dst has type float.
• dst has width of 8.

Notes: Accumulates the floating point multiply into the adder pipeline, then flushes the
adder pipeline to dst.

Details:

dst   cycles  latency  temporaries  comments
p8f   8       9        -            microcode
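
As a rough illustration of the two steps named in the Notes above (accumulate, then flush), the following Python sketch models the pipelines as four-element lists. The function name vec_maflush is hypothetical, and the model captures only the arithmetic, not the cycle or latency figures from the Details table.

```python
def vec_maflush(fpadd, fpmul):
    """Model of dst = fpadd + fpmul for one 4-element poly vector."""
    # Step 1: accumulate the multiply results into the adder pipeline.
    accumulated = [a + m for a, m in zip(fpadd, fpmul)]
    # Step 2: flush the adder pipeline out to the destination register.
    dst = list(accumulated)
    return dst


dst = vec_maflush([1.0, 1.0, 1.0, 1.0], [2.0, 3.0, 4.0, 5.0])
```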


Document No. 06-RM-1137 Revision: 3.A 


ClearSpeed Technology plc 




_vec.macc (32-bit floating point multiply accumulate operation)

Operands: src1, src2

Description:

fpadd = fpmul + fpadd
fpmul = src1 * src2

Constraints:

• All operands have domain poly.
• src1 has vector size of 4.
• src1 has type float.
• src1 has width of 4.
• src2 has vector size of 4.
• src2 has type float.
• src2 has width of 4.

Notes: Accumulates the result of the multiply pipe into the accumulator, then multiplies src1
by src2.

Details:

src1  src2  cycles  latency  temporaries  comments
p4f   p4f   4       4        -            microcode


_vec.macc (64-bit floating point multiply accumulate operation)

Operands: src1, src2

Description:

fpadd = fpmul + fpadd
fpmul = src1 * src2

Constraints:

• All operands have domain poly.
• src1 has vector size of 4.
• src1 has type float.
• src1 has width of 8.
• src2 has vector size of 4.
• src2 has type float.
• src2 has width of 8.

Notes: Accumulates the result of the multiply pipe into the accumulator, then multiplies src1
by src2.

Details:

src1  src2  cycles  latency  temporaries  comments
p8f   p8f   4       4        -            microcode
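
The ordering in the description above matters: each _vec.macc first accumulates the products that are already in the multiply pipe and only then starts the new multiply, so the most recent products are not yet part of fpadd. A minimal Python sketch of that behaviour follows. The class Pipes and its method names are hypothetical, both pipelines are assumed to start at zero, and timing is not modelled.

```python
class Pipes:
    """Toy model of the two pipelines; each holds one 4-element vector."""

    def __init__(self):
        # Assumption: both pipelines start out holding zeros.
        self.fpadd = [0.0] * 4
        self.fpmul = [0.0] * 4

    def macc(self, src1, src2):
        # Step 1: fold the products already in the multiply pipe into the accumulator.
        self.fpadd = [m + a for m, a in zip(self.fpmul, self.fpadd)]
        # Step 2: start the new multiply; its result is not yet part of fpadd.
        self.fpmul = [x * y for x, y in zip(src1, src2)]

    def maflush(self):
        # Model of dst = fpadd + fpmul, used here to drain the last products.
        return [a + m for a, m in zip(self.fpadd, self.fpmul)]


pipes = Pipes()
pipes.macc([1.0, 2.0, 3.0, 4.0], [1.0, 1.0, 1.0, 1.0])
pipes.macc([2.0, 2.0, 2.0, 2.0], [1.0, 2.0, 3.0, 4.0])
pending = list(pipes.fpadd)  # holds only the first set of products
total = pipes.maflush()      # holds the sum of both sets of products
```

In this model a chain of multiply accumulates has to end with an operation that also accumulates the final products, such as the multiply and accumulate flush, otherwise the last multiply is left out of the result.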
_vec.macc (32-bit floating point multiply accumulate operation)

Operands: src1, src2

Description:

fpadd = fpmul + fpadd
fpmul = src1 * src2

Constraints:

• All operands have domain poly.
• src1 has vector size of 4.
• src1 has type float.
• src1 has width of 4.
• src2 has type float.
• src2 has width of 4.

Notes: Accumulates the result of the multiply pipe into the accumulator, then multiplies src1
by src2.

Details:

src1  src2  cycles  latency  temporaries  comments
p4f   p4f   4       4        -            microcode


_vec.macc (64-bit floating point multiply accumulate operation)

Operands: src1, src2

Description:

fpadd = fpmul + fpadd
fpmul = src1 * src2

Constraints:

• All operands have domain poly.
• src1 has vector size of 4.
• src1 has type float.
• src1 has width of 8.
• src2 has type float.
• src2 has width of 8.

Notes: Accumulates the result of the multiply pipe into the accumulator, then multiplies src1
by src2.

Details:

src1  src2  cycles  latency  temporaries  comments
p8f   p8f   4       4        -            microcode


_vec.macc.postmul (32-bit floating point multiply accumulate operation)

Operands: src1, src2

Description:

fpadd = fpmul + 0
fpmul = src1 * src2

Constraints:

• All operands have domain poly.
• src1 has vector size of 4.
• src1 has type float.
• src1 has width of 4.
• src2 has vector size of 4.
• src2 has type float.
• src2 has width of 4.

Notes: Moves the result of the multiply pipe into the accumulator by accumulating with
zero, then multiplies src1 by src2.

Details:

src1  src2  cycles  latency  temporaries  comments
p4f   p4f   4       4        -            microcode


_vec.macc.postmul (64-bit floating point multiply accumulate operation)

Operands: src1, src2

Description:

fpadd = fpmul + 0
fpmul = src1 * src2

Constraints:

• All operands have domain poly.
• src1 has vector size of 4.
• src1 has type float.
• src1 has width of 8.
• src2 has vector size of 4.
• src2 has type float.
• src2 has width of 8.

Notes: Moves the result of the multiply pipe into the accumulator by accumulating with
zero, then multiplies src1 by src2.

Details:

src1  src2  cycles  latency  temporaries  comments
p8f   p8f   4       4        -            microcode
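
Because the accumulator is loaded with fpmul + 0, whatever was previously held in fpadd is discarded. The Python sketch below illustrates this under the same simplifying assumptions as the description: pipelines are modelled as four-element lists, the function name macc_postmul is hypothetical, and timing is ignored.

```python
def macc_postmul(fpadd, fpmul, src1, src2):
    """Model of fpadd = fpmul + 0 followed by fpmul = src1 * src2."""
    # The old fpadd argument is deliberately unused: accumulating with zero
    # replaces the accumulator contents with the pending products.
    new_fpadd = [m + 0.0 for m in fpmul]
    new_fpmul = [x * y for x, y in zip(src1, src2)]
    return new_fpadd, new_fpmul


stale = [9.0, 9.0, 9.0, 9.0]  # leftover accumulator contents (assumed)
fpadd, fpmul = macc_postmul(stale, [1.0, 2.0, 3.0, 4.0],
                            [1.0, 1.0, 1.0, 1.0], [5.0, 6.0, 7.0, 8.0])
```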
_vec.macc.postmul (32-bit floating point multiply accumulate operation)

Operands: src1, src2

Description:

fpadd = fpmul + fpadd
fpmul = src1 * src2

Constraints:

• All operands have domain poly.
• src1 has vector size of 4.
• src1 has type float.
• src1 has width of 4.
• src2 has type float.
• src2 has width of 4.

Notes: Moves the result of the multiply pipe into the accumulator by accumulating with
zero, then multiplies src1 by src2.

Details:

src1  src2  cycles  latency  temporaries  comments
p4f   p4f   4       4        -            microcode


_vec.macc.postmul (64-bit floating point multiply accumulate operation)

Operands: src1, src2

Description:

fpadd = fpmul + fpadd
fpmul = src1 * src2

Constraints:

• All operands have domain poly.
• src1 has vector size of 4.
• src1 has type float.
• src1 has width of 8.
• src2 has type float.
• src2 has width of 8.

Notes: Moves the result of the multiply pipe into the accumulator by accumulating with
zero, then multiplies src1 by src2.

Details:

src1  src2  cycles  latency  temporaries  comments
p8f   p8f   4       4        -            microcode


_vec.scale (32-bit floating point multiply operation)

Operands: src1, src2

Description:

fpmul = src1 * src2

Constraints:

• All operands have domain poly.
• src1 has vector size of 4.
• src1 has type float.
• src1 has width of 4.
• src2 has type float.
• src2 has width of 4.

Notes: Multiplies each of the four elements in the specified vector by a scalar, leaving the
results in the floating point multiply pipeline for either recirculating or flushing to the
register file.

Details:

src1  src2  cycles  latency  temporaries  comments
p4f   p4f   4       4        -            microcode


_vec.scale (64-bit floating point multiply operation)

Operands: src1, src2

Description:

fpmul = src1 * src2

Constraints:

• All operands have domain poly.
• src1 has vector size of 4.
• src1 has type float.
• src1 has width of 8.
• src2 has type float.
• src2 has width of 8.

Notes: Multiplies each of the four elements in the specified vector by a scalar, leaving the
results in the floating point multiply pipeline for either recirculating or flushing to the
register file.

Details:

src1  src2  cycles  latency  temporaries  comments
p8f   p8f   4       4        -            microcode
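
The constraints above give src1 a vector size of 4 and src2 none, so src2 acts as a single scalar applied to every element. A small Python sketch of that broadcast follows; the function name vec_scale is hypothetical and the model covers only the arithmetic.

```python
def vec_scale(src1, src2):
    """Model of fpmul = src1 * src2 with src1 a 4-vector and src2 a scalar."""
    if len(src1) != 4:
        raise ValueError("src1 must have vector size 4")
    # The same scalar multiplies each of the four vector elements.
    return [x * src2 for x in src1]


fpmul = vec_scale([1.0, 2.0, 3.0, 4.0], 0.5)
```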
_vec.offset (32-bit floating point add operation)

Operands: src1, src2

Description:

fpadd = src1 + src2

Constraints:

• All operands have domain poly.
• src1 has vector size of 4.
• src1 has type float.
• src1 has width of 4.
• src2 has type float.
• src2 has width of 4.

Notes: Adds a scalar to each of the four elements in the specified vector, leaving the
results in the floating point add pipeline for either recirculating or flushing to the register file.

Details:

src1  src2  cycles  latency  temporaries  comments
p4f   p4f   4       4        -            microcode


_vec.offset (64-bit floating point add operation)

Operands: src1, src2

Description:

fpadd = src1 + src2

Constraints:

• All operands have domain poly.
• src1 has vector size of 4.
• src1 has type float.
• src1 has width of 8.
• src2 has type float.
• src2 has width of 8.

Notes: Adds a scalar to each of the four elements in the specified vector, leaving the
results in the floating point add pipeline for either recirculating or flushing to the register file.

Details:

src1  src2  cycles  latency  temporaries  comments
p8f   p8f   4       4        -            microcode


_vec.noffset (32-bit floating point sub operation)

Operands: src1, src2

Description:

fpadd = src1 - src2

Constraints:

• All operands have domain poly.
• src1 has vector size of 4.
• src1 has type float.
• src1 has width of 4.
• src2 has type float.
• src2 has width of 4.

Notes: Subtracts the specified scalar from the four elements in the specified vector, leaving
the results in the floating point add pipeline for either recirculating or flushing to the
register file.

Details:

src1  src2  cycles  latency  temporaries  comments
p4f   p4f   4       4        -            microcode


_vec.noffset (64-bit floating point sub operation)

Operands: src1, src2

Description:

fpadd = src1 - src2

Constraints:

• All operands have domain poly.
• src1 has vector size of 4.
• src1 has type float.
• src1 has width of 8.
• src2 has type float.
• src2 has width of 8.

Notes: Subtracts the specified scalar from the four elements in the specified vector, leaving
the results in the floating point add pipeline for either recirculating or flushing to the
register file.

Details:

src1  src2  cycles  latency  temporaries  comments
p8f   p8f   4       4        -            microcode


_vec.displace (32-bit floating point sub operation)

Operands: src1, src2

Description:

fpadd = src1 - src2

Constraints:

• All operands have domain poly.
• src1 has type float.
• src1 has width of 4.
• src2 has vector size of 4.
• src2 has type float.
• src2 has width of 4.

Notes: Subtracts the four elements in the specified vector from the specified scalar, leaving
the results in the floating point add pipeline for either recirculating or flushing to the
register file.

Details:

src1  src2  cycles  latency  temporaries  comments
p4f   p4f   4       4        -            microcode


_vec.displace (64-bit floating point sub operation)

Operands: src1, src2

Description:

fpadd = src1 - src2

Constraints:

• All operands have domain poly.
• src1 has type float.
• src1 has width of 8.
• src2 has vector size of 4.
• src2 has type float.
• src2 has width of 8.

Notes: Subtracts the four elements in the specified vector from the specified scalar, leaving
the results in the floating point add pipeline for either recirculating or flushing to the
register file.

Details:

src1  src2  cycles  latency  temporaries  comments
p8f   p8f   4       4        -            microcode
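
_vec.noffset and _vec.displace share the formula fpadd = src1 - src2 and differ only in which operand is the vector. The Python sketch below contrasts the two. The function names are hypothetical, the pipelines are modelled as plain lists, and timing is not represented.

```python
def vec_noffset(vector, scalar):
    """Model of _vec.noffset: src1 is the 4-vector, src2 is the scalar."""
    return [v - scalar for v in vector]


def vec_displace(scalar, vector):
    """Model of _vec.displace: src1 is the scalar, src2 is the 4-vector."""
    return [scalar - v for v in vector]


v = [1.0, 2.0, 3.0, 4.0]
a = vec_noffset(v, 1.0)    # vector minus scalar
b = vec_displace(1.0, v)   # scalar minus vector
```

For the same vector and scalar, each element of one result is the negation of the matching element of the other.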
_vec.ascale (32-bit floating point operation)

Operands: src

Description:

fpmul = src * fpadd

Constraints:

• All operands have domain poly.
• src has vector size of 4.
• src has type float.
• src has width of 4.

Notes: Multiplies the result of the add pipeline with src, leaving the result in the multiply pipe.

Details:

src   cycles  latency  temporaries  comments
p4f   4       4        -            microcode


_vec.ascale (64-bit floating point operation)

Operands: src

Description:

fpmul = src * fpadd

Constraints:

• All operands have domain poly.
• src has vector size of 4.
• src has type float.
• src has width of 8.

Notes: Multiplies the result of the add pipeline with src, leaving the result in the multiply pipe.

Details:

src   cycles  latency  temporaries  comments
p8f   4       4        -            microcode
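
One way to read fpmul = src * fpadd is as the second half of an expression such as (a + b) * c, where an earlier add leaves a + b in the adder pipeline. The Python sketch below shows that chaining. The helper names vec_add and vec_ascale are hypothetical stand-ins, and the model ignores timing.

```python
def vec_add(src1, src2):
    """Model of an elementwise add that leaves its result in fpadd."""
    return [a + b for a, b in zip(src1, src2)]


def vec_ascale(src, fpadd):
    """Model of fpmul = src * fpadd: scale the adder result by src."""
    return [s * a for s, a in zip(src, fpadd)]


fpadd = vec_add([1.0, 2.0, 3.0, 4.0], [1.0, 1.0, 1.0, 1.0])
fpmul = vec_ascale([2.0, 2.0, 2.0, 2.0], fpadd)  # (a + b) * c, elementwise
```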




_vec.msacc (32-bit floating point multiply sub accumulate operation)

Operands: src

Description:

fpadd = fpmul - src

Constraints:

• All operands have domain poly.
• src has vector size of 4.
• src has type float.
• src has width of 4.

Notes: Accumulates the result of the multiply pipe into the accumulator, then multiplies src1
by src2.

Details:

src   cycles  latency  temporaries  comments
p4f   4       4        -            microcode


_vec.msacc (32-bit floating point multiply sub accumulate operation)

Operands: src

Description:

fpadd = fpmul - src

Constraints:

• All operands have domain poly.
• src has vector size of 4.
• src has type float.
• src has width of 4.

Notes: Accumulates the result of the multiply pipe into the accumulator, then multiplies src1
by src2.

Details:

src   cycles  latency  temporaries  comments
p4f   4       4        -            microcode


_vec.sflush (32-bit floating point sub adder flush function)

Operands: dst

Description:

dst = fpadd

Constraints:

• All operands have domain poly.
• dst has vector size of 4.
• dst has type float.
• dst has width of 4.

Notes: Flushes the values held in the floating point adder pipeline out to the register
location indicated in dst, while asserting the neg bit as the adder pipeline is stepped.

Details:

dst   cycles  latency  temporaries  comments
p4f   4       5        -            microcode


_vec.sflush (64-bit floating point sub adder flush function)

Operands: dst

Description:

dst = fpadd

Constraints:

• All operands have domain poly.
• dst has vector size of 4.
• dst has type float.
• dst has width of 8.

Notes: Flushes the values held in the floating point adder pipeline out to the register
location indicated in dst, while asserting the neg bit as the adder pipeline is stepped.

Details:

dst   cycles  latency  temporaries  comments
p8f   4       5        -            microcode


_vec.msflush (32-bit floating point sub adder flush function)

Operands: dst

Description:

dst = fpadd

Constraints:

• All operands have domain poly.
• dst has vector size of 4.
• dst has type float.
• dst has width of 4.

Notes: Accumulates the multiply pipe with the adder pipe, asserting the neg bit on the first
cycles, and flushes the adder pipe out to dst.

Details:

dst   cycles  latency  temporaries  comments
p4f   8       9        -            microcode


_vec.msflush (64-bit floating point sub adder flush function)

Operands: dst

Description:

dst = fpadd

Constraints:

• All operands have domain poly.
• dst has vector size of 4.
• dst has type float.
• dst has width of 8.

Notes: Accumulates the multiply pipe with the adder pipe, asserting the neg bit on the first
cycles, and flushes the adder pipe out to dst.

Details:

dst   cycles  latency  temporaries  comments
p8f   8       9        -            microcode


_vec.mac.sum (32-bit floating point operation)

Operands: dst

Description:

dst = fpmul[0] + fpmul[1] + fpmul[2] + fpmul[3]

Constraints:

• All operands have domain poly.
• dst has type float.
• dst has width of 4.

Notes: Reduces the contents of the multiply pipeline, adding them together and writing out
to dst.

Details:

dst   cycles  latency  temporaries  comments
p4f   11      12       -            microcode
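
The reduction above turns the four values held in the multiply pipeline into a single scalar, which is the last step of a four-element dot product. A minimal Python sketch follows. The function name vec_mac_sum is hypothetical, and the left-to-right summation order is an assumption, since the order used by the hardware is not stated here and can affect floating point rounding.

```python
def vec_mac_sum(fpmul):
    """Model of dst = fpmul[0] + fpmul[1] + fpmul[2] + fpmul[3]."""
    if len(fpmul) != 4:
        raise ValueError("the multiply pipeline holds four elements")
    # Assumed order of summation: left to right.
    return fpmul[0] + fpmul[1] + fpmul[2] + fpmul[3]


# Elementwise products, as an earlier vector multiply would leave them in fpmul.
products = [a * b for a, b in zip([1.0, 2.0, 3.0, 4.0], [4.0, 3.0, 2.0, 1.0])]
dot = vec_mac_sum(products)
```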
_vec.cast.to8f (floating point cast operation)

Operands: src

Description:

fpadd = (poly double)src

Constraints:

• All operands have domain poly.
• src has vector size of 4.

Notes: Casts src to 64-bit floating point, holding the result in the adder pipe.

Details:

src   cycles  latency  temporaries  comments
p2u   4       4        -            microcode
p4u   4       4        -            microcode
p2s   4       4        -            microcode
p4s   4       4        -            microcode
p4f   4       4        -            microcode


_vec.cast.to4f (floating point cast operation)

Operands: src

Description:

fpadd = (poly float)src

Constraints:

• All operands have domain poly.
• src has vector size of 4.

Notes: Casts src to 32-bit floating point, holding the result in the adder pipe.

Details:

src   cycles  latency  temporaries  comments
p2u   4       4        -            microcode
p4u   4       4        -            microcode
p2s   4       4        -            microcode
p4s   4       4        -            microcode


_vec.cast.to4u (floating point cast operation)

Operands: src

Description:

fpadd = (poly unsigned int)src

Constraints:

• All operands have domain poly.
• src has type float.
• src has vector size of 4.

Notes: Casts src to 32-bit unsigned integer, holding the result in the adder pipe.

Details:

src   cycles  latency  temporaries  comments
p4f   4       4        -            microcode




_vec.cast.to4s (floating point cast operation)

Operands: src

Description:

fpadd = (poly signed int)src

Constraints:

• All operands have domain poly.
• src has type float.
• src has vector size of 4.

Notes: Casts src to 32-bit signed integer, holding the result in the adder pipe.

Details:

src   cycles  latency  temporaries  comments
p4f   4       4        -            microcode


_vec.cast.to2u (floating point cast operation)

Operands: src

Description:

fpadd = (poly unsigned short)src

Constraints:

• All operands have domain poly.
• src has type float.
• src has vector size of 4.

Notes: Casts src to 16-bit unsigned integer, holding the result in the adder pipe.

Details:

src   cycles  latency  temporaries  comments
p4f   4       4        -            microcode


_vec.cast.to2s (floating point cast operation)

Operands: src

Description:

fpadd = (poly signed short)src

Constraints:

• All operands have domain poly.
• src has type float.
• src has vector size of 4.

Notes: Casts src to 16-bit signed integer, holding the result in the adder pipe.

Details:

src   cycles  latency  temporaries  comments
p4f   4       4        -            microcode


_vec.flush.4fto8f (floating point cast operation)

Operands: dst

Description:

dst = (poly float)fpadd

Constraints:

• All operands have domain poly.
• dst has type float.
• dst has vector size of 4.
• dst has width of 8.

Notes: Casts fpadd to 64-bit floating point, storing the result in dst.

Details:

dst   cycles  latency  temporaries  comments
p8f   4       5        -            microcode
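
Widening a 32-bit float to a 64-bit float does not change its value, because every 32-bit value is exactly representable in 64 bits. The Python sketch below illustrates that general IEEE 754 property, on the assumption that the pipelines hold IEEE 754 values. The helper names to_f32 and flush_4f_to_8f are hypothetical and do not describe the microcode.

```python
import struct


def to_f32(x):
    """Round a Python float (64-bit) to the nearest 32-bit float value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def flush_4f_to_8f(fpadd32):
    # Widening is modelled as the identity on values that already fit in 32 bits.
    return [float(v) for v in fpadd32]


fpadd = [to_f32(v) for v in (0.1, 0.5, 1.0, 3.0)]  # assumed adder contents
dst = flush_4f_to_8f(fpadd)
```

Note that the widened value of 0.1 is the 32-bit approximation of 0.1, not the closer 64-bit approximation: widening cannot restore precision that was lost earlier.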
_vec.flush.4fto4us (floating point cast operation)

Operands: dst

Description:

dst = (poly int)fpadd

Constraints:

• All operands have domain poly.
• dst has vector size of 4.
• dst has width of 4.

Notes: Casts fpadd to 32-bit signed or unsigned integer, storing the result in dst.

Details:

dst   cycles  latency  temporaries  comments
p4u   4       5        -            microcode
p4s   4       5        -            microcode


_vec.flush.4fto2us (floating point cast operation)

Operands: dst

Description:

dst = (poly short)fpadd

Constraints:

• All operands have domain poly.
• dst has vector size of 4.
• dst has width of 2.

Notes: Casts fpadd to 16-bit signed or unsigned integer, storing the result in dst.

Details:

dst   cycles  latency  temporaries  comments
p2u   4       5        -            microcode
p2s   4       5        -            microcode


_vec.flush.8fto4us (floating point cast operation)

Operands: dst

Description:

dst = (poly int)fpadd

Constraints:

• All operands have domain poly.
• dst has vector size of 4.
• dst has width of 4.

Notes: Casts fpadd to 32-bit signed or unsigned integer, storing the result in dst.

Details:

dst   cycles  latency  temporaries  comments
p4u   4       5        -            microcode
p4s   4       5        -            microcode


_vec.flush.8fto2us (floating point cast operation)

Operands: dst

Description:

dst = (poly short)fpadd

Constraints:

• All operands have domain poly.
• dst has vector size of 4.
• dst has width of 2.

Notes: Casts fpadd to 16-bit signed or unsigned integer, storing the result in dst.

Details:

dst   cycles  latency  temporaries  comments
p2u   4       5        -            microcode
p2s   4       5        -            microcode


_vec.mul_swb (32-bit floating point multiply operation)

Operands: dst, src1, src2

Description:

fpmul = src1 * src2
dst[0] = src1[0] * src2[0]

Constraints:

• All operands have domain poly.
• dst has vector size of 4.
• dst has type float.
• dst has width of 4.
• src1 has vector size of 4.
• src1 has type float.
• src1 has width of 4.
• src2 has vector size of 4.
• src2 has type float.
• src2 has width of 4.

Notes: Multiplies the four elements in the specified vectors together, leaving the results in
the floating point multiply pipeline. The write back to dst is started for the first pair
multiplied.

Details:

dst   src1  src2  cycles  latency  temporaries  comments
p4f   p4f   p4f   4       5        -            microcode


_vec.mul_swb (64-bit floating point multiply operation)

Operands: dst, src1, src2

Description:

fpmul = src1 * src2
dst[0] = src1[0] * src2[0]

Constraints:

• All operands have domain poly.
• dst has vector size of 4.
• dst has type float.
• dst has width of 8.
• src1 has vector size of 4.
• src1 has type float.
• src1 has width of 8.
• src2 has vector size of 4.
• src2 has type float.
• src2 has width of 8.

Notes: Multiplies the four elements in the specified vectors together, leaving the results in
the floating point multiply pipeline. The write back to dst is started for the first pair
multiplied.

Details:

dst   src1  src2  cycles  latency  temporaries  comments
p8f   p8f   p8f   4       5        -            microcode


_vec.amul_wba (32-bit floating point multiply operation)

Operands: prev_dst, dst, src

Description:

prev_dst = fpadd
fpmul = fpadd * src
dst[0] = fpadd[0] * src[0]

Constraints:

• All operands have domain poly.
• prev_dst has vector size of 4.
• prev_dst has type float.
• prev_dst has width of 4.
• dst has vector size of 4.
• dst has type float.
• dst has width of 4.
• src has vector size of 4.
• src has type float.
• src has width of 4.

Notes: Multiplies the result of the adder with src, and populates the multiply pipe. Also
writes back the results of the adder to prev_dst, starting the write back to dst of the multiply
pipe on the final cycle.

Details:

prev_dst  dst   src   cycles  latency  temporaries  comments
p4f       p4f   p4f   4       5        -            microcode


_vec.amul_wba (64-bit floating point multiply operation)

Operands: prev_dst, dst, src

Description:

prev_dst = fpadd
fpmul = fpadd * src
dst[0] = fpadd[0] * src[0]

Constraints:

• All operands have domain poly.
• prev_dst has vector size of 4.
• prev_dst has type float.
• prev_dst has width of 8.
• dst has vector size of 4.
• dst has type float.
• dst has width of 8.
• src has vector size of 4.
• src has type float.
• src has width of 8.

Notes: Multiplies the result of the adder with src, and populates the multiply pipe. Also
writes back the results of the adder to prev_dst, starting the write back to dst of the multiply
pipe on the final cycle.

Details:

prev_dst  dst   src   cycles  latency  temporaries  comments
p8f       p8f   p8f   4       5        -            microcode


_vec.mmul_wbm (32-bit floating point multiply operation)

Operands: prev_dst, dst, src

Description:

prev_dst = fpmul
fpmul = fpmul * src
dst[0] = fpmul[0] * src[0]

Constraints:

• All operands have domain poly.
• prev_dst has vector size of 4.
• prev_dst has type float.
• prev_dst has width of 4.
• dst has vector size of 4.
• dst has type float.
• dst has width of 4.
• src has vector size of 4.
• src has type float.
• src has width of 4.

Notes: Multiplies the result of the multiplier with src, and populates the multiply pipe. Also
writes back the results of the multiplier to prev_dst, starting the write back to dst of the
multiply pipe on the final cycle.

Details:

prev_dst  dst   src   cycles  latency  temporaries  comments
p4f       p4f   p4f   4       5        -            microcode


_vec.mmul_wbm (64-bit floating point multiply operation)

Operands: prev_dst, dst, src

Description:

prev_dst = fpmul
fpmul = fpmul * src
dst[0] = fpmul[0] * src[0]

Constraints:

• All operands have domain poly.
• prev_dst has vector size of 4.
• prev_dst has type float.
• prev_dst has width of 8.
• dst has vector size of 4.
• dst has type float.
• dst has width of 8.
• src has vector size of 4.
• src has type float.
• src has width of 8.

Notes: Multiplies the result of the multiplier with src, and populates the multiply pipe. Also
writes back the results of the multiplier to prev_dst, starting the write back to dst of the
multiply pipe on the final cycle.

Details:

prev_dst  dst   src   cycles  latency  temporaries  comments
p8f       p8f   p8f   4       5        -            microcode


_vec.msquare_wbm (32-bit floating point multiply operation)

Operands: prev_dst, dst

Description:

prev_dst = fpmul
fpmul = fpmul * fpmul
dst[0] = fpmul[0] * fpmul[0]

Constraints:

• All operands have domain poly.
• prev_dst has vector size of 4.
• prev_dst has type float.
• prev_dst has width of 4.
• dst has vector size of 4.
• dst has type float.
• dst has width of 4.

Notes: Squares the result of the multiplier, and populates the multiply pipe. Also writes back
the results of the multiplier to prev_dst, starting the write back to dst of the multiply pipe on
the final cycle.

Details:

prev_dst  dst   cycles  latency  temporaries  comments
p4f       p4f   4       5        -            microcode


_vec.msquare_wbm (64-bit floating point multiply operation)

Operands: prev_dst, dst

Description:
prev_dst = fpmul
fpmul = fpmul * fpmul
dst[0] = fpmul[0] * fpmul[0]

Constraints:
• All operands have domain poly.
• prev_dst has vector size of 4.
• prev_dst has type float.
• prev_dst has width of 8.
• dst has vector size of 4.
• dst has type float.
• dst has width of 8.

Notes: Squares the current multiplier results and repopulates the multiply pipe. Also writes
back the previous multiplier results to prev_dst, and starts the write back of the multiply
pipe to dst on the final cycle.

Details:

prev_dst  dst  cycles  latency  temporaries | comments
p8f       p8f  4       5                    | microcode
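
As an illustration of the square-with-write-back behaviour described above, the following Python sketch models the multiply pipe as a four-element list. The helper name vec_msquare_wbm is hypothetical, and pipeline timing and the later write back to dst are not modelled.

```python
def vec_msquare_wbm(fpmul):
    """Behavioural model of _vec.msquare_wbm (hypothetical helper).

    Returns (prev_dst, new_fpmul):
      prev_dst  = fpmul            # old multiplier results are written back
      new_fpmul = fpmul * fpmul    # each pipe element is squared
    """
    prev_dst = list(fpmul)
    new_fpmul = [m * m for m in fpmul]
    return prev_dst, new_fpmul


prev_dst, fpmul = vec_msquare_wbm([1.5, -2.0, 3.0, 0.5])
print(prev_dst)
print(fpmul)
```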


_vec.scale_swb (32-bit floating point multiply operation)

Operands: dst, src1, src2

Description:
fpmul = src1 * src2
dst[0] = src1[0] * src2

Constraints:
• All operands have domain poly.
• dst has vector size of 4.
• dst has type float.
• dst has width of 4.
• src1 has vector size of 4.
• src1 has type float.
• src1 has width of 4.
• src2 has type float.
• src2 has width of 4.

Notes: Multiplies each of the four elements of src1 by src2, leaving the results in the
floating point multiply pipeline. The write back to dst is started for the first pair
multiplied.

Details:

dst  src1  src2  cycles  latency  temporaries | comments
p4f  p4f   p4f   4       5                    | microcode


_vec.scale_swb (64-bit floating point multiply operation)

Operands: dst, src1, src2

Description:
fpmul = src1 * src2
dst[0] = src1[0] * src2

Constraints:
• All operands have domain poly.
• dst has vector size of 4.
• dst has type float.
• dst has width of 8.
• src1 has vector size of 4.
• src1 has type float.
• src1 has width of 8.
• src2 has type float.
• src2 has width of 8.

Notes: Multiplies each of the four elements of src1 by src2, leaving the results in the
floating point multiply pipeline. The write back to dst is started for the first pair
multiplied.

Details:

dst  src1  src2  cycles  latency  temporaries | comments
p8f  p8f   p8f   4       5                    | microcode
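
The following Python sketch illustrates the scale operation described above. It assumes that src2 is a single scalar value applied to every element of src1 (src2 carries no vector-size constraint); the helper name vec_scale_swb is hypothetical, and timing and write back are not modelled.

```python
def vec_scale_swb(src1, src2):
    """Behavioural model of _vec.scale_swb (hypothetical helper).

    fpmul = src1 * src2, with src2 assumed to be a scalar that is
    applied to each of the four elements of src1.
    """
    return [x * src2 for x in src1]


fpmul = vec_scale_swb([1.0, 2.0, 3.0, 4.0], 0.5)
print(fpmul)
```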


_vec.mscale_wbm (32-bit floating point multiply operation)

Operands: prev_dst, dst, src

Description:
prev_dst = fpmul
fpmul = fpmul * src
dst[0] = fpmul[0] * src

Constraints:
• All operands have domain poly.
• prev_dst has vector size of 4.
• prev_dst has type float.
• prev_dst has width of 4.
• dst has vector size of 4.
• dst has type float.
• dst has width of 4.
• src has type float.
• src has width of 4.

Notes: Scales the values in the multiplier pipe by src after writing back the fpmul value to
prev_dst. Initiates the write back of the new fpmul values to dst on the last cycle.

Details:

prev_dst  dst  src  cycles  latency  temporaries | comments
p4f       p4f  p4f  4       5                    | microcode


_vec.mscale_wbm (64-bit floating point multiply operation)

Operands: prev_dst, dst, src

Description:
prev_dst = fpmul
fpmul = fpmul * src
dst[0] = fpmul[0] * src

Constraints:
• All operands have domain poly.
• prev_dst has vector size of 4.
• prev_dst has type float.
• prev_dst has width of 8.
• dst has vector size of 4.
• dst has type float.
• dst has width of 8.
• src has type float.
• src has width of 8.

Notes: Scales the values in the multiplier pipe by src after writing back the fpmul value to
prev_dst. Initiates the write back of the new fpmul values to dst on the last cycle.

Details:

prev_dst  dst  src  cycles  latency  temporaries | comments
p8f       p8f  p8f  4       5                    | microcode


_vec.ascale_wba (32-bit floating point multiply operation)

Operands: prev_dst, dst, src

Description:
prev_dst = fpadd
fpmul = fpadd * src
dst[0] = fpadd[0] * src

Constraints:
• All operands have domain poly.
• prev_dst has vector size of 4.
• prev_dst has type float.
• prev_dst has width of 4.
• dst has vector size of 4.
• dst has type float.
• dst has width of 4.
• src has type float.
• src has width of 4.

Notes: Scales the values in the adder pipe by src after writing back the fpadd value to
prev_dst. Initiates the write back of the new fpmul values to dst on the last cycle.

Details:

prev_dst  dst  src  cycles  latency  temporaries | comments
p4f       p4f  p4f  4       5                    | microcode


_vec.ascale_wba (64-bit floating point multiply operation)

Operands: prev_dst, dst, src

Description:
prev_dst = fpadd
fpmul = fpadd * src
dst[0] = fpadd[0] * src

Constraints:
• All operands have domain poly.
• prev_dst has vector size of 4.
• prev_dst has type float.
• prev_dst has width of 8.
• dst has vector size of 4.
• dst has type float.
• dst has width of 8.
• src has type float.
• src has width of 8.

Notes: Scales the values in the adder pipe by src after writing back the fpadd value to
prev_dst. Initiates the write back of the new fpmul values to dst on the last cycle.

Details:

prev_dst  dst  src  cycles  latency  temporaries | comments
p8f       p8f  p8f  4       5                    | microcode
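
The hand-over from the adder pipe to the multiply pipe can be sketched in Python as follows. This is a behavioural model under stated assumptions: src is treated as a scalar (it carries no vector-size constraint), the helper name vec_ascale_wba is hypothetical, and timing and the later write back to dst are not modelled.

```python
def vec_ascale_wba(fpadd, src):
    """Behavioural model of _vec.ascale_wba (hypothetical helper).

    Returns (prev_dst, fpmul):
      prev_dst = fpadd          # adder results are written back
      fpmul    = fpadd * src    # adder results, scaled, enter the multiply pipe
    """
    prev_dst = list(fpadd)
    fpmul = [a * src for a in fpadd]
    return prev_dst, fpmul


prev_dst, fpmul = vec_ascale_wba([1.0, 2.0, 3.0, 4.0], 3.0)
print(prev_dst)
print(fpmul)
```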


_vec.add_swb (32-bit floating point add operation)

Operands: dst, src1, src2

Description:
fpadd = src1 + src2
dst[0] = src1[0] + src2[0]

Constraints:
• All operands have domain poly.
• dst has vector size of 4.
• dst has type float.
• dst has width of 4.
• src1 has vector size of 4.
• src1 has type float.
• src1 has width of 4.
• src2 has vector size of 4.
• src2 has type float.
• src2 has width of 4.

Notes: Adds the four elements in the specified vectors together, leaving the results in the
floating point add pipeline. The write back to dst is started for the first pair added.

Details:

dst  src1  src2  cycles  latency  temporaries | comments
p4f  p4f   p4f   4       5                    | microcode


_vec.add_swb (64-bit floating point add operation)

Operands: dst, src1, src2

Description:
fpadd = src1 + src2
dst[0] = src1[0] + src2[0]

Constraints:
• All operands have domain poly.
• dst has vector size of 4.
• dst has type float.
• dst has width of 8.
• src1 has vector size of 4.
• src1 has type float.
• src1 has width of 8.
• src2 has vector size of 4.
• src2 has type float.
• src2 has width of 8.

Notes: Adds the four elements in the specified vectors together, leaving the results in the
floating point add pipeline. The write back to dst is started for the first pair added.

Details:

dst  src1  src2  cycles  latency  temporaries | comments
p8f  p8f   p8f   4       5                    | microcode


_vec.madd_wbm (32-bit floating point add operation)

Operands: prev_dst, dst, src

Description:
prev_dst = fpmul
fpadd = fpmul + src
dst[0] = fpmul[0] + src[0]

Constraints:
• All operands have domain poly.
• prev_dst has vector size of 4.
• prev_dst has type float.
• prev_dst has width of 4.
• dst has vector size of 4.
• dst has type float.
• dst has width of 4.
• src has vector size of 4.
• src has type float.
• src has width of 4.

Notes: Continues the fpmul write back while adding src to the fpmul values and populating
the add pipe. Starts the write back of the adder on the last cycle.

Details:

prev_dst  dst  src  cycles  latency  temporaries | comments
p4f       p4f  p4f  4       5                    | microcode


_vec.madd_wbm (64-bit floating point add operation)

Operands: prev_dst, dst, src

Description:
prev_dst = fpmul
fpadd = fpmul + src
dst[0] = fpmul[0] + src[0]

Constraints:
• All operands have domain poly.
• prev_dst has vector size of 4.
• prev_dst has type float.
• prev_dst has width of 8.
• dst has vector size of 4.
• dst has type float.
• dst has width of 8.
• src has vector size of 4.
• src has type float.
• src has width of 8.

Notes: Continues the fpmul write back while adding src to the fpmul values and populating
the add pipe. Starts the write back of the adder on the last cycle.

Details:

prev_dst  dst  src  cycles  latency  temporaries | comments
p8f       p8f  p8f  4       5                    | microcode
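
To show how a multiply-pipe result can feed the adder, the following Python sketch chains a scale operation into _vec.madd_wbm. It is a behavioural model only: both helper names are hypothetical, the scalar second operand of the scale step is an assumption, and cycle-level overlap between the two instructions is not modelled.

```python
def vec_scale_swb(src1, src2):
    """Hypothetical model: fpmul = src1 * src2 (src2 assumed scalar)."""
    return [x * src2 for x in src1]


def vec_madd_wbm(fpmul, src):
    """Hypothetical model of _vec.madd_wbm.

    Returns (prev_dst, fpadd):
      prev_dst = fpmul          # multiplier results are written back
      fpadd    = fpmul + src    # element-wise sum populates the add pipe
    """
    prev_dst = list(fpmul)
    fpadd = [m + s for m, s in zip(fpmul, src)]
    return prev_dst, fpadd


# Compute (x * 2.0) + c element-wise, keeping the intermediate product.
fpmul = vec_scale_swb([1.0, 2.0, 3.0, 4.0], 2.0)
prev_dst, fpadd = vec_madd_wbm(fpmul, [10.0, 20.0, 30.0, 40.0])
print(prev_dst)  # the products written back to prev_dst
print(fpadd)     # the sums now held in the add pipe
```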


_vec.aadd_wba (32-bit floating point add operation)

Operands: prev_dst, dst, src

Description:
prev_dst = fpadd
fpadd = fpadd + src
dst[0] = fpadd[0] + src[0]

Constraints:
• All operands have domain poly.
• prev_dst has vector size of 4.
• prev_dst has type float.
• prev_dst has width of 4.
• dst has vector size of 4.
• dst has type float.
• dst has width of 4.
• src has vector size of 4.
• src has type float.
• src has width of 4.

Notes: Continues the fpadd write back while adding src to the fpadd values and populating
the add pipe. Starts the write back of the adder on the last cycle.

Details:

prev_dst  dst  src  cycles  latency  temporaries | comments
p4f       p4f  p4f  4       5                    | microcode


_vec.aadd_wba (64-bit floating point add operation)

Operands: prev_dst, dst, src

Description:
prev_dst = fpadd
fpadd = fpadd + src
dst[0] = fpadd[0] + src[0]

Constraints:
• All operands have domain poly.
• prev_dst has vector size of 4.
• prev_dst has type float.
• prev_dst has width of 8.
• dst has vector size of 4.
• dst has type float.
• dst has width of 8.
• src has vector size of 4.
• src has type float.
• src has width of 8.

Notes: Continues the fpadd write back while adding src to the fpadd values and populating
the add pipe. Starts the write back of the adder on the last cycle.

Details:

prev_dst  dst  src  cycles  latency  temporaries | comments
p8f       p8f  p8f  4       5                    | microcode
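
Issued back to back, _vec.aadd_wba behaves like a running sum in which each instruction also writes back the previous partial result. The Python sketch below illustrates this reading; the helper name vec_aadd_wba is hypothetical, the add pipe is modelled as a four-element list, and timing is not modelled.

```python
def vec_aadd_wba(fpadd, src):
    """Behavioural model of _vec.aadd_wba (hypothetical helper).

    Returns (prev_dst, new_fpadd):
      prev_dst  = fpadd          # previous adder results are written back
      new_fpadd = fpadd + src    # element-wise sum repopulates the add pipe
    """
    prev_dst = list(fpadd)
    new_fpadd = [a + s for a, s in zip(fpadd, src)]
    return prev_dst, new_fpadd


fpadd = [0.0, 0.0, 0.0, 0.0]   # assumed initial pipe contents
written_back = []
for src in ([1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0, 40.0]):
    prev_dst, fpadd = vec_aadd_wba(fpadd, src)
    written_back.append(prev_dst)

print(written_back)  # partial sums written back by each instruction
print(fpadd)         # final contents of the add pipe
```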


_vec.offset_swb (32-bit floating point add operation)

Operands: dst, src1, src2

Description:
fpadd = src1 + src2
dst[0] = src1[0] + src2

Constraints:
• All operands have domain poly.
• dst has vector size of 4.
• dst has type float.
• dst has width of 4.
• src1 has vector size of 4.
• src1 has type float.
• src1 has width of 4.
• src2 has type float.
• src2 has width of 4.

Notes: Adds the specified scalar to the vector src1. Starts the write back of the add
pipeline to dst on the last cycle.

Details:

dst  src1  src2  cycles  latency  temporaries | comments
p4f  p4f   p4f   4       5                    | microcode


_vec.offset_swb (64-bit floating point add operation)

Operands: dst, src1, src2

Description:
fpadd = src1 + src2
dst[0] = src1[0] + src2

Constraints:
• All operands have domain poly.
• dst has vector size of 4.
• dst has type float.
• dst has width of 8.
• src1 has vector size of 4.
• src1 has type float.
• src1 has width of 8.
• src2 has type float.
• src2 has width of 8.

Notes: Adds the specified scalar to the vector src1. Starts the write back of the add
pipeline to dst on the last cycle.

Details:

dst  src1  src2  cycles  latency  temporaries | comments
p8f  p8f   p8f   4       5                    | microcode


_vec.moffset_wbm (32-bit floating point add operation)

Operands: prev_dst, dst, src

Description:
prev_dst = fpmul
fpadd = fpmul + src
dst[0] = fpmul[0] + src

Constraints:
• All operands have domain poly.
• prev_dst has vector size of 4.
• prev_dst has type float.
• prev_dst has width of 4.
• dst has vector size of 4.
• dst has type float.
• dst has width of 4.
• src has type float.
• src has width of 4.

Notes: Continues the fpmul write back to prev_dst. Adds src to fpmul and starts the write
back to dst on the last cycle.

Details:

prev_dst  dst  src  cycles  latency  temporaries | comments
p4f       p4f  p4f  4       5                    | microcode


_vec.moffset_wbm (64-bit floating point add operation)

Operands: prev_dst, dst, src

Description:
prev_dst = fpmul
fpadd = fpmul + src
dst[0] = fpmul[0] + src

Constraints:
• All operands have domain poly.
• prev_dst has vector size of 4.
• prev_dst has type float.
• prev_dst has width of 8.
• dst has vector size of 4.
• dst has type float.
• dst has width of 8.
• src has type float.
• src has width of 8.

Notes: Continues the fpmul write back to prev_dst. Adds src to fpmul and starts the write
back to dst on the last cycle.

Details:

prev_dst  dst  src  cycles  latency  temporaries | comments
p8f       p8f  p8f  4       5                    | microcode


_vec.aoffset_wba (32-bit floating point add operation)

Operands: prev_dst, dst, src

Description:
prev_dst = fpadd
fpadd = fpadd + src
dst[0] = fpadd[0] + src

Constraints:
• All operands have domain poly.
• prev_dst has vector size of 4.
• prev_dst has type float.
• prev_dst has width of 4.
• dst has vector size of 4.
• dst has type float.
• dst has width of 4.
• src has type float.
• src has width of 4.

Notes: Continues the fpadd write back to prev_dst. Adds src to fpadd and starts the write
back to dst on the last cycle.

Details:

prev_dst  dst  src  cycles  latency  temporaries | comments
p4f       p4f  p4f  4       5                    | microcode


_vec.aoffset_wba (64-bit floating point add operation)

Operands: prev_dst, dst, src

Description:
prev_dst = fpadd
fpadd = fpadd + src
dst[0] = fpadd[0] + src

Constraints:
• All operands have domain poly.
• prev_dst has vector size of 4.
• prev_dst has type float.
• prev_dst has width of 8.
• dst has vector size of 4.
• dst has type float.
• dst has width of 8.
• src has type float.
• src has width of 8.

Notes: Continues the fpadd write back to prev_dst. Adds src to fpadd and starts the write
back to dst on the last cycle.

Details:

prev_dst  dst  src  cycles  latency  temporaries | comments
p8f       p8f  p8f  4       5                    | microcode


_vec.masub_wbm (32-bit floating point add operation)

Operands: prev_dst, src

Description:
prev_dst = fpmul
fpadd = fpmul - src
dst[0] = fpadd[0] - src

Constraints:
• All operands have domain poly.
• prev_dst has vector size of 4.
• prev_dst has type float.
• prev_dst has width of 4.
• src has vector size of 4.
• src has type float.
• src has width of 4.

Notes: Continues the fpmul write back to prev_dst. Subtracts src from fpmul, leaving the
results in the add pipe.

Details:

prev_dst  src  cycles  latency  temporaries | comments
p4f       p4f  4       4                    | microcode


_vec.masub_wbm (32-bit floating point add operation)

Operands: prev_dst, src

Description:
prev_dst = fpmul
fpadd = fpmul - src
dst[0] = fpadd[0] - src

Constraints:
• All operands have domain poly.
• prev_dst has vector size of 4.
• prev_dst has type float.
• prev_dst has width of 4.
• src has vector size of 4.
• src has type float.
• src has width of 4.

Notes: Continues the fpmul write back to prev_dst. Subtracts src from fpmul, leaving the
results in the add pipe.

Details:

prev_dst  src  cycles  latency  temporaries | comments
p4f       p4f  4       4                    | microcode


_vec.square_swb (64-bit floating point add operation)

Operands: dst, src

Description:
fpmul = src * src
dst[0] = src[0] * src[0]

Constraints:
• All operands have domain poly.
• dst has vector size of 4.
• dst has type float.
• dst has width of 8.
• src has vector size of 4.
• src has type float.
• src has width of 8.

Notes: Multiplies src by itself and starts the write back to dst.

Details:

dst  src  cycles  latency  temporaries | comments
p8f  p8f  4       5                    | microcode


_vec.mscale_setacc_wbm (64-bit floating point add operation)

Operands: dst, src1, src2

Description:
dst = fpmul
fpadd = src1 + 0
fpmul = fpmul * src2

Constraints:
• All operands have domain poly.
• dst has vector size of 4.
• dst has type float.
• dst has width of 8.
• src1 has type float.
• src1 has width of 8.
• src2 has type float.
• src2 has width of 8.

Notes: Writes back the multiply pipe to dst, then multiplies the multiply pipe by src2 while
setting the add pipe to contain src1.

Details:

dst  src1  src2  cycles  latency  temporaries | comments
p8f  p8f   p8f                                | microcode
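
The three effects of this instruction can be sketched in Python as follows. This is a behavioural model under stated assumptions: src1 and src2 are treated as scalars (neither carries a vector-size constraint), src1 is assumed to be replicated into all four slots of the add pipe, and the helper name vec_mscale_setacc_wbm is hypothetical.

```python
def vec_mscale_setacc_wbm(fpmul, src1, src2):
    """Behavioural model of _vec.mscale_setacc_wbm (hypothetical helper).

    Returns (dst, fpadd, new_fpmul):
      dst       = fpmul           # multiply pipe is written back
      fpadd     = src1 + 0        # add pipe is initialised to src1
      new_fpmul = fpmul * src2    # multiply pipe is scaled by src2
    """
    dst = list(fpmul)
    fpadd = [src1 + 0.0 for _ in fpmul]   # assumed broadcast of the scalar
    new_fpmul = [m * src2 for m in fpmul]
    return dst, fpadd, new_fpmul


dst, fpadd, fpmul = vec_mscale_setacc_wbm([1.0, 2.0, 3.0, 4.0], 5.0, 0.5)
print(dst)
print(fpadd)
print(fpmul)
```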


_vec.moffset_amul (64-bit floating point add operation)

Operands: src1, src2

Description:
fpadd = fpmul + src1
fpmul = fpadd * src2

Constraints:
• All operands have domain poly.
• src1 has vector size of 4.
• src1 has type float.
• src1 has width of 8.
• src2 has type float.
• src2 has width of 8.

Notes: Adds the fpmul values and src1 together, storing the result in the adder, while
concurrently multiplying the fpadd values by src2 and storing in the add pipe.

Details:

src1 src2 cycles latency temporaries | comments
p8f p8f 4 microcode
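
The Python sketch below gives one possible reading of this instruction. It follows the Description, so the product is stored in fpmul, and it assumes that "concurrently" means both right-hand sides read the pipe contents from before the instruction; this ordering is an assumption, not something the text confirms. The helper name vec_moffset_amul is hypothetical and src2 is treated as a scalar.

```python
def vec_moffset_amul(fpmul, fpadd, src1, src2):
    """Behavioural model of _vec.moffset_amul (hypothetical helper).

    Returns (new_fpadd, new_fpmul), both computed from the OLD pipe values:
      new_fpadd = fpmul + src1    # src1 is a four-element vector
      new_fpmul = fpadd * src2    # src2 is assumed to be a scalar
    """
    new_fpadd = [m + s for m, s in zip(fpmul, src1)]
    new_fpmul = [a * src2 for a in fpadd]
    return new_fpadd, new_fpmul


fpadd, fpmul = vec_moffset_amul(
    fpmul=[1.0, 2.0, 3.0, 4.0],
    fpadd=[10.0, 20.0, 30.0, 40.0],
    src1=[1.0, 1.0, 1.0, 1.0],
    src2=2.0,
)
print(fpadd)
print(fpmul)
```

If the hardware instead used the newly computed fpadd values for the multiply, new_fpmul would be derived from new_fpadd rather than from the old fpadd; the sketch makes that choice explicit so it can be changed.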


_vec.maadd_wba (64-bit floating point add operation)

Operands: dst

Description:
dst = fpmul + fpadd

Constraints:
• All operands have domain poly.
• dst has vector size of 4.
• dst has type float.
• dst has width of 8.

Notes: Adds the multiply pipe and the add pipe together and writes the result out to the
register file.

Details:

dst  cycles  latency  temporaries | comments
p8f  7       8                    | microcode
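
The final combination of the two pipes can be sketched in Python as follows. It is a behavioural model only: the helper name vec_maadd_wba is hypothetical, both pipes are modelled as four-element lists, and timing (7 cycles, latency 8) is not modelled.

```python
def vec_maadd_wba(fpmul, fpadd):
    """Behavioural model of _vec.maadd_wba (hypothetical helper).

    dst = fpmul + fpadd, element-wise; the result is the value
    written out to the register file.
    """
    return [m + a for m, a in zip(fpmul, fpadd)]


dst = vec_maadd_wba([1.0, 2.0, 3.0, 4.0], [0.5, 0.5, 0.5, 0.5])
print(dst)
```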
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Index 


.get 529 
Sig 530 
.put 534 
Sig 433 
.pe 
reg 
.get 531 
Sig 532 
.put 533 
Sig 560 


add 
floating point 32 bit 49 
floating point 64 bit 50 
integer 48 


addc 
integer 51 


Addressing 16 
Modes 
Direct 17, 22 
Indexed 17 
Indirect 17 
Offset 17 
Strided 22 
Registers 6 


aeo 
.data 
.get 553 
shift 551 
result 552 


Alignment 8, 11 


all 
disable 198 


ALU 16 
Mono 18 


Poly 20 


and 93 
.extend 94 


andif 


cry 185 

.eg 
floating point 32 bit 
floating point 64 bit 
integer 161 

.fpadd 


neg §=6193 
-Nneg 194 
.nzero.) =—192 
zero 191 
.ge 
floating point 32 bit 
floating point 64 bit 
signed 172 
unsigned 171 
.gt 
floating point 32 bit 
floating point 64 bit 
signed 168 
unsigned 167 


floating point 32 bit 
floating point 64 bit 
signed 158 
unsigned 157 


floating point 32 bit 
floating point 64 bit 


signed 154 
unsigned 153 
-ncry 186 

ne 


floating point 32 bit 
floating point 64 bit 
integer 164 


159 
160 


169 
170 


165 
166 


155 
156 


151 
152 


162 
163 
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neg = 189 Architecture 13 
-Nneg 190 PE 20 
.nzero§=6188 Processor 13,15 
zero. 187 SIMD 19 
any asr 84 
enable 197 
Branch instructions 19 break 509 
stop 510 


Branching 21 


integer 1 byte poly to integer 2 byte 
poly 36 

integer 1 byte poly to integer 4 byte 
poly 38 

integer 2 byte mono to integer 4 byte 
mono 32 

integer 2 byte poly to integer 1 byte 
poly 34 

integer 2 byte poly to integer 4 byte 
poly 33 

integer 4 byte mono to integer 2 byte 
mono 29 

integer 4 byte poly to integer 1 byte 
poly 31 


Cache 13 integer 4 byte poly to integer 2 byte 
poly 30 
call 
.varargs 315, 316, 317, 318, 319, 320 cmp 
floating point 32 bit 100 
Carry flag floating point 64 bit 101 
Mono 9, 16, 19 signed 103 
Poly 9, 21 unsigned 102 
cast are , 
32 bit float to 64 bit float 44 meeting Pome Se iiee EN 
4 byte integer to 64 bit float 41 neaang point Gre’ at 
4 byte integer to float 40 nee 
64 bit float to 32 bit float 43 Be . 
64 bit float to 4 byte integer 42 fibating PoInte a2 ws nen 
float to 4 byte integer 39 Nearing POMESS PI. 73 
integer 1 byte mono to integer 2 byte signed iy 
mono 35 unsigned 126 
integer 1 byte mono to integer 4 byte gt 
mono 37 floating point 32 bit 120 


floating point 64 bit 121 
signed 123 
unsigned 122 
le 
floating point 32 bit 110 
floating point 64 bit 111 
signed 113 
unsigned 112 
{It 
106 
107 


floating point 32 bit 
floating point 64 bit 
signed 109 
unsigned 108 

.ne 
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e 


floating point 32 bit 117 
floating point 64 bit 118 
integer 119 


cmpc 
signed 105 
unsigned 104 


complex 
-mul 586, 587 


Conditional code 17 
Mono 17, 19 
Poly 18, 21 


Consolidation 22 


Constraints 7, 10 
Alignment 8, 11 
Domains of operands 8, 11 


Data cache 13 


dcache 
invalidate 511,512, 513 
.writeback 514, 515, 516 


Direct addressing 17 


div 
floating point 32 bit 72 
floating point 64 bit 73 
integer 71 


else 195 
Enable 
Constraints 11 
Register 20 
State 9,21 
Testing on mono execution unit 
18 
Swazzle 22 
fdup 28 


Enable register bits 11 


Types of operands 8, 11 


Vectors 8 
Widths 8, 11 


context 
load 508 
store 507 


Control registers 
Mono 18 


Control unit 13, 14 
Cycle counts 10 


cycles 
.get 499 
.put 500 


.end 

floating point 64 bit 
-normalised 

div 536 
.start 

floating point 64 bit 


Domain 
of instructions 6,9 
of operands 6, 8, 11 


dup 27 


enable 
.get 501 
.put 502 


endif 196 


Execution unit 13 
Mono 13, 18 
Poly 13,19 


fld 
poly with offset 403 


538 


537 
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poly, no offset 402 Poly 21 
.index Mono 18 
poly with offset 416 Inexact 16, 19 
poly, no offset 415 Underflow 16, 19 
Poly 20 
Floating point Inexact 21 
See FPU Underflow 21 
fino 26 Underflow 
Mono 16, 19 
Forced Poly 21 
Instructions 9 
Load 21 fst 
Store 21 poly with offset 411 
poly, no offset 410 
FPU .index 
Inexact poly with offset 420 
Mono 16, 19 poly, no offset 419 


Hardware instructions 10 


VO 20 neg §=183 
Asynchronous 22 .Nneg = 184 
Consolidated 22 .Nzero) =—182 
Direct addressed 22 zero 181 
Inter-PE 22 .ge 
Programmed 21 floating point 32 bit 147 
Registers 18 floating point 64 bit 148 
Semaphores 19, 22 signed 150 
Strided 22 unsigned 149 
Swazzle 22 .gt 

floating point 32 bit 143 

icache floating point 64 bit 144 


invalidate 517,518,519 


signed 146 
.prefetch 302 J 


unsigned 145 


if le 
floating point 32 bit 133 
cry 173 floating point 64 bit 134 
.eq signed 136 
floating point 32 bit 137 unsigned 135 
floating point 64 bit 138 lt 
integer 139 floating point 32 bit 129 
.fpadd floating point 64 bit 130 
signed 132 
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unsigned 131 
-msb 179 
-ncry 174 
.ne 
floating point 32 
floating point 64 
integer 142 
neg 177 
-nmsb-~ =. 180 
Nneg 178 
.nzero §=6176 
zero. 175 


If instructions 21 
Indexed addressing 17 
Indirect addressing 17 


Instruction cache 13 


j 215 
if 252 


cry 220 
.eq 240 
float 32 bit 
float 64 bit 
.ge 
float 32 bit 
float 64 bit 
signed 251 


bit 
bit 


238 
239 


248 
249 


unsigned 250 


.gt 
float 32 bit 
float 64 bit 
signed 247 


244 
245 


unsigned 246 


le 
float 32 bit 
float 64 bit 
signed 237 


234 
235 


unsigned 236 


lt 
float 32 bit 


230 


Instruction set 14 


Instructions 


Branch 19 
Constraints 7, 10 
Cycles 10 
Description 10 
Domain 6,9 
Forced 9 
Format 5 
Hardware 10 

If 21 

Jump 19 

Macro 9,10 
Microcode 10 
Mono 7 

Poly 7,9 

Side effects 8, 11 
Timing 10 


float 64 bit 231 


signed 233 
unsigned 232 
.msb 226 
cry 221 
ne =243 


float 32 bit 241 
float 64 bit 242 
neg §=224 
.Amsb =. 227 
Nneg §=225 
-Novf 229 
Npred = 217 
.Nzero) §=—.223 
vf §=228 
pred 216 
.preds 
.combined 
cand = 219 
or 218 
zero. =—-.222 
ifn 253 
sub =. 254 
jf 293 
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.cry 259 
.eq 279 
float 32 bit 277 
float 64 bit 278 
.ge 
float 32 bit 289 
float 64 bit 290 
signed 292 
unsigned 291 
.gt 
float 32 bit 285 
float 64 bit 286 
signed 288 
unsigned 287 
le 
float 32 bit 273 
float 64 bit 274 
signed 276 
unsigned 275 
lt 
float 32 bit 269 
float 64 bit 270 


signed 272 
unsigned 271 
.msb = 265 
-Ncry 260 
ne = 284 


float 32 bit 280, 282 
float 64 bit 281, 283 
neg 263 
.Amsb = =266 
.nneg 264 
-novf 268 
.Npred 256 
.nzero §=262 
.OVF 267 
.pred 255 
.preds 
.combined 
cand =258 
or =257 
zero 6.261 
ifn 294 


jr 321 
.if 


“sub 


cry 324 
.eq 


float 32 bit 342 
float 64 bit 343 
signed 345 
unsigned 344 


.ge 


float 32 bit 354 
float 64 bit 355 
signed 357 
unsigned 356 


.gt 


float 32 bit 350 
float 64 bit 351 
signed 353 


le 


float 32 bit 338 
float 64 bit 339 
signed 341 
unsigned 340 


float 32 bit 334 
float 64 bit 335 
signed 337 
unsigned 336 


-msb 330 
Ncry 325 
.ne 


float 32 bit 346 
float 64 bit 347 
signed 349 


unsigned 348, 352 
neg 328 
Amsb =. 331 
nneg 329 
-novf 333 
Npred 323 
.Nzero §=—327 
.ovf §=3332 
.pred 322 
zero. §=—-3326 


358 
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cry 361 
.eg 
float 32 bit 379 
float 64 bit 380 
signed 382 
unsigned 381 
.ge 
float 32 bit 391 
float 64 bit 392 
signed 394 
unsigned 393 
.gt 
float 32 bit 387 
float 64 bit 388 
signed 390 
le 
float 32 bit 375 
float 64 bit 376 
signed 378 
unsigned 377 
lt 
float 32 bit 371 


Id 
L 397 


mono with offset 
mono, no offset 396 
poly with offset 400 
poly, no offset 399 
.direct 
mono with offset 424 
mono, no offset 423 
yield 
mono with offset 432 
mono, no offset 431 
.index 
poly with offset 414 
poly, no offset 413 
yield 
mono with offset 428 
mono, no offset 427 


Literals 6 


M MAC 20 
Mono 18, 20 


float 64 bit 372 
signed 374 
unsigned 373 


.msb 367 
cry 362 
ne 


float 32 bit 383 
float 64 bit 384 
signed 386 
unsigned 385, 389 


neg §=365 
.Amsb =. 368 
.Nneg 366 
-nNovf 370 
.Npred 360 
.Nzero }=6364 
.ovf 369 
.pred 359 
zero = 3363 


Jump instructions 19 


Load 


Forced 21 
Poly 20, 21 


Is 
.index 
.fget 
.fput 


Isl 
unsigned 


Islc 
unsigned 


Isr 
unsigned 


Isrc 
unsigned 


Poly 20 


422 
421 


75 


88 


80 


89 
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Macro instructions 9, 10 mov 24 
Memory MSB flag 
PE 20 Mono 9, 16, 19 
Poly 20 Poly 9, 21 
Microcode instructions 10 MTAP 
See Multi-Threaded Array Processor 
Mono 
ALU 18 mul 
Carry flag 9, 16, 19 floating point 65 
Conditional code 17, 19 floating point 64 bit 66 
Control registers 18 integer 62, 63 
Execution unit 13, 18 hi 
FP Inexact 16, 19 integer 70 
FP Underflow 16, 19 lo 
FPU 18 integer 67, 68 
Instructions 7 
MAC 18, 20 Multi-Threaded Array Processor 13 
MSB flag 9, 16, 19 Multi-threading support 14 
Negative flag 9, 16, 19 yO 22 
Overflow flag 9, 16, 19 Priority 14 


Registers 18 
Result registers 18 
Status register 18 


Registers 18 
Semaphores 19 


Predicates 18 mutex 
Zero flag 9,16, 19 ac 
tart 521 
mono yield 
ls start 522 
-base end = 525 
.put 434 pio 
‘result start 523 
.get 527 yield 
‘put 528 start 524 
start 520 
N neg negc 
floating point 32 bit 59 integer 61 
floating point 64 bit 60 
integer 58 nop 494 
.poly 495 
Negative flag 
Mono 9, 16, 19 normalise 540 
Poly 9, 21 not 91 
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extend 92 
O Offset addressing 17 Type 6 
Vector 6 
Operands 10 Vectors 8 
Constraints 7, 10 
Domain 6 or 95 
Immediate 6 .extend 96 
Overlapping 11 
Register 6 Overflow flag 
Size 6 Mono 9, 16, 19 
Specifiers 6 Poly 9, 21 
Syntax 5 Overlapping operands 
P PC atomic 
See Program counter .end 458 
Start 456 
oe eae ) yield 
ee start 457 
Architecture 20 
C ae .data 
ommunication get 464 
Enable state 9, 21 : 
aa Sig 463 
Inter-PE communications 22 
M 50 .put 461 
Sey Sig 462 
Swazzle 22 
.putget 
penum 497 semaphore 
.put 472 
PIO sig 465 
See Programmed I/O .strided 
; .base 
pie ad increment 
< ae .put 470 
aele t 459 .put 468 
‘pu .update 469 
.Offset 
t 466, 467 ee 
- 460 write 482 
ree : flush 483 
selects read 484 
read 478 
—_ NO 
een address 
write 474 ; 
write 
.con 476 back 
flush 477 size 
flush 475 .put 471 


11 


485 
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.write 480 .mono 
flush 481 cand 45, 46 
.transfer 
.semaphore pred 
put 473 cand =204 
clr ~=200 
pioc .get 212 
.reg mov 202 
.get 544, 545 -naandb = 211 
put 548, 549 -nand = 205 
-naorb §=210 
pioe nor 207 
reg 
get 546, 547 pie pe 
.put 542, 543 ‘or 206 
Poly .put 213 
ALU 20 set 201 
Carry flag 9, 21 ‘xor 208 


Conditional code 18, 21 


Predicated instructions 21 
Execution unit 13, 19 


FP Inexact 21 Predicates 18 

FP Underflow 21 User defined 19 
FPU 20 

Instructions 7,9 Priority 

Load 20, 21 Threads 14 

Pa 20 Processing element 
Memory 20 See also Poly 

MSB flag 9, 21 See PE 

Negative flag 9,21 

Overflow flag 9, 21 Processor 

Registers 20 Architecture 13,15 
Status register 20 

Store 20, 21 Program counter 18 


Zero flag 9, 21 Programmed I/O 21 


a“ Programming model 14 
R Registers 16 Result 18 
Addressing 6 Status 18 
Control 18 Poly 20 
Mono 18 Status 20 
Enable 9, 20, 21 Predicates 18 
Vo 18 Program counter 18 
Mono 18 Result 18 
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Mono 18 
Swazzle 22 
Return address 18 

Status 8, 11, 16 
Mono 18 

Poly 20 

Temporary 9, 10 


Result registers 18 
Mono 18 


S sem 
.get 487 


.request 559 
.put 488, 489 
select 556 
Sig 490 
sync 6492 

selected 558 
wait 491 

selected 557 


Semaphores 19, 22 
Yo 19 
Signal 19 
Wait 19 


set 
.fixedarg 314 
.vararg §313 


.varargs 303, 304, 305, 306, 307, 
308, 309, 310, 311, 312 


Side effects 8, 10, 11 
Signal 19 


SIMD 13, 19, 21 


Single Instruction Multiple Data 


See SIMD 


st 


mono with offset 406 
mono, no offset 405 


poly with offset 408 
poly, no offset 407 


return 295 
.get 296 
«high §=298 
low 297 
.put 299 
-high =301 
low 300 


Return address register 


.direct 
mono with offset 
mono, no offset 
.index 
poly with offset 
poly, no offset 
yield 
mono with offset 
mono, no offset 


Stack 
Enable 9, 21 


status 
.fpadd 
.get 505 
.fpmul 
.get 506 
.get 503 
.put 504 


Status registers 8, 11 
Mono 18 
Poly 20 


Store 
Forced 21 
Poly 20, 21 


sub 
floating point 32 bit 
floating point 64 bit 
integer 53 


subc 
integer 56 


18 


426 
425 


418 
417 


430 
429 


16 


54 
55 
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Swazzle 20, 22 
and enable state 22 
Result registers 22 


swazzle 

.circular 
-hightolow 447 
-hightolowx2 448 
-hightolowx4 449 
lowtohigh 444 
lowtohighx2 445 
.lowtohighx4 446 
Swap 

.odd 
.up 450 
ehigh 


-put 451 


.get 453 


T Temporary registers 9, 10 
terminate 496 


thread 
.get 498 


Threads 


V _vec 


.aadd_wba 
32-bit floating point add operation 
703 
64-bit floating point add operation 
704 
-acc 
32-bit floating point accumulate 
operation 641 
64-bit floating point accumulate 
operation 642 
.add 
32-bit floating point add operation 
637 
64-bit floating point add operation 
638 


-hightolow 439 
-hightolowx2 440 
-hightolowx4 441 
.low 


.put 452 


.get 454 
lowtohigh 436 
-lowtohighx2 437 
-lowtohighx4 438 
.Swap 

even 

up =442 
.odd 

.up 443 


Syntax 5 
Operands 5 


Priority 14 


Timing 
Instructions 10 


Types 6,7 
of operands 8, il 


.add_swb 
32-bit floating point add operation 
699 
64-bit floating point add operation 
700 
.aflush 
32-bit floating point adder flush 
function 643 
64-bit floating point adder flush 
function 644 
.amul_wba


32-bit floating point multiply 
operation 687 
64-bit floating point multiply 


operation 688 
.aoffset_wba 
32-bit floating point add operation 
709 


64-bit floating point add operation 

710 
.ascale 

32-bit floating point operation 665 
64-bit floating point operation 666 
.ascale_wba 

32-bit floating point multiply 
operation 697 

64-bit floating point multiply 
operation 698 
.cast 

.to2s 


floating point cast operation 
679 


.to2u 


floating point cast operation 
678 


.to4f 


floating point cast operation 
675 


.to4s 


floating point cast operation 
677 


.to4u 


floating point cast operation 
676 


.to8f 


floating point cast operation 
674 


.displace 


32-bit floating point sub operation 
663 


64-bit floating point sub operation 
664 


.flush
.4fto2us 


floating point cast operation 
682 


.4fto4us 


floating point cast operation 
681 


.4fto8f 


floating point cast operation 
680 


.8fto2us 


floating point cast operation 
684 


.8fto4us 


floating point cast operation 
683 


.maadd_wba 


64-bit floating point add operation 
716 


.mac


.sum


32-bit floating point operation 
673 


.macc


32-bit floating point multiply 
accumulate operation 649, 651 
64-bit floating point multiply 
accumulate operation 650, 652 
.postmul 
32-bit floating point multiply 
accumulate operation 653, 655 


64-bit floating point multiply 
accumulate operation 654, 656 


.madd_wbm 


32-bit floating point add operation 
701 

64-bit floating point add operation 
702 


.maflush 


32-bit floating point multiply and 
accumulate flush function 647 
64-bit floating point multiply and 
accumulate flush function 648 


.masub_wbm 


32-bit floating point add operation 
711, 712 


.mflush 


32-bit floating point multiply flush 
function 645 
64-bit floating point multiply flush 
function 646 


.mmul_wbm 


32-bit floating point multiply 
operation 689 
64-bit floating point multiply 
operation 690 


.moffset_amul 


64-bit floating point add operation 
715 


.moffset_wbm 


32-bit floating point add operation 
707 

64-bit floating point add operation 
708 


.msacc


32-bit floating point multiply sub 
accumulate operation 667, 668 


.mscale_setacc_wbm 


64-bit floating point add operation 
714 


.mscale_wbm 


32-bit floating point multiply 
operation 695 
64-bit floating point multiply 
operation 696 


.msflush 


32-bit floating point sub adder flush 
function 671 
64-bit floating point sub adder flush 
function 672 


.msquare_wbm 


32-bit floating point multiply 
operation 691 
64-bit floating point multiply 
operation 692 


.mul 


32-bit floating point multiply 
operation 635 
64-bit floating point multiply 
operation 636 


.mul_swb 


32-bit floating point multiply 
operation 685 
64-bit floating point multiply 
operation 686 


.noffset


32-bit floating point sub operation 
661 

64-bit floating point sub operation 
662 


.offset 


32-bit floating point add operation 
659 

64-bit floating point add operation 
660 


.offset_swb 


32-bit floating point add operation 
705 

64-bit floating point add operation 
706 


.scale


32-bit floating point multiply 
operation 657 
64-bit floating point multiply 
operation 658 


.scale_swb 


32-bit floating point multiply 
operation 693 
64-bit floating point multiply 
operation 694 


.sflush


32-bit floating point sub adder flush 
function 669 
64-bit floating point sub adder flush 
function 670 


.square_swb 


64-bit floating point add operation 
713 


.sub


32-bit floating point subtract 
operation 639 
64-bit floating point subtract 
operation 640 


.add 


.begin 570, 571 
.end 574, 575
.tail 572, 573


.mul 


.begin 580, 581 
.end 584, 585
.tail 582, 583


.mulacc 


.begin 562, 563
.end 568, 569
.head 564, 565
.step 566, 567


.sub


.begin 576, 577
.tail 578, 579




W


vector 


.add 
floating point 32-bit 589 
floating point 64-bit 590 
scalar 
floating point 32-bit 591 
floating point 64-bit 592 
.addmul
floating point 32-bit 611 
floating point 64-bit 612 
scalar 
floating point 32-bit 613 
floating point 64-bit 614 
.cast 


vector 16-bit integer to vector 32-bit 
float 622 


vector 16-bit integer to vector 64-bit 
float 626 


vector 32-bit float to vector 16-bit int 
624 


vector 32-bit float to vector 32-bit int 
623 


vector 32-bit float to vector 64-bit 
float 629 


vector 32-bit integer to vector 32-bit 
float 621 


vector 32-bit integer to vector 64-bit 
float 625 


vector 64-bit float to vector 16-bit int 
628 

vector 64-bit float to vector 32-bit 

float 630 


vector 64-bit float to vector 32-bit int 
627 


.displace 
floating point 32-bit 597 
floating point 64-bit 598 
.mul
floating point 32-bit 599
floating point 64-bit 600
scalar
floating point 32-bit 601 


Wait 19 


floating point 64-bit 602 
.mulacc
floating point 32-bit 603 
floating point 64-bit 604 
scalar 
floating point 32-bit 605 
floating point 64-bit 606 
.mulnegacc
floating point 32-bit 607 
floating point 64-bit 608 
scalar 
floating point 32-bit 609 
floating point 64-bit 610 
.neg
floating point 32-bit 619 
floating point 64-bit 620 
.reduce 
.add 


32-bit floating point add 
reduction 631 


64-bit floating point add 
reduction 632 
.mul
32-bit floating point multiply 
reduction 633 
64-bit floating point multiply 
reduction 634 
.sub
floating point 32-bit 593 
floating point 64-bit 594 
scalar 
floating point 32-bit 595 
floating point 64-bit 596 
.submul 
floating point 32-bit 615 
floating point 64-bit 616 
scalar 
floating point 32-bit 617 
floating point 64-bit 618 


Vector operands 6, 8 


Widths 
of operands 8, 11
X


xor 97
.extend 98


Z


Zero flag
Poly 9, 21
Mono 9, 16, 19